STATISTICS AND EPIDEMIOLOGY


"Nevertheless, in a real sense, statistics is 
the study of populations, or aggregates of 
individuals, rather than of individuals. 
Scientific theories which involve the properties 
of large aggregates of individuals, and not 
necessarily the properties of the individuals 
themselves . . . are essentially statistical 
arguments, and are liable to misinterpretations 
as soon as the statistical nature of the 
argument is lost sight of." 


Sir Ronald Fisher 
Statistical significance and confidence intervals


Many papers in the Journal use
statistical methods and one of the
aims of the review process is to try
to ensure that appropriate methods have
been used. Often papers report results of
comparative studies that are designed to
answer questions such as whether one 
treatment is superior to another for a 
particular disease, or whether there is an 
association between some form of behaviour 
(for example, taking regular exercise or 
smoking) and the occurrence of some 
disease. Comparative studies are almost 
invariably carried out on a sample of 
individuals who are chosen from the 
population of individuals to whom it is 
intended to generalize the results. Data are 
collected on the sample in order to make 
inferences on the population. Valid 
inferences can only be drawn if the sample 
is chosen in such a way that it is representative
of the population. Otherwise a bias
could occur; epidemiological methods are 
designed to eliminate such biases. 

Since the aim of a statistical analysis is to 
make inferences, it is paramount to express
whatever inferences can be drawn in the
most informative way. There are several
methods of statistical inference, but the two 
that are most commonly used are 
significance testing and confidence interval 
estimation. The former is well known and 
is featured by quoting P values. Many 
authors appear to be under the impression 
that a profusion of P values is necessary: 
regrettably this impression has been bolstered
in the past by editors of biological journals.
Significance testing has its place but, as
mentioned by Healy in 1978,[1] "it is widely
agreed among statisticians (if less so among
the more naive users of statistics) that
significance testing is not the be-all and end-all
of the subject". In this leading article I
would like to discuss the characteristics of
both methods of inference, show that a 
confidence interval contains the result of a 
significance test, but not vice versa, and 
suggest that confidence intervals are the 
answers to the more interesting questions 
that data can be used to answer. 

Any particular study is based on a 
particular sample; however, it is useful to 
imagine that the study is repeated with a 
different sample being selected each time. 
These hypothetical studies will give different 
results because they contain different
individuals, and individuals vary in any
characteristic because of biological variability.
The differences are termed sampling
variability. It follows then that the results
that are obtained from a particular sample
can only be taken as an approximation to the
actual situation in the whole population.
Statistical methods are concerned with
assessing the degree of approximation and
what may be reasonably inferred, given that
a different sample would have produced a
different result.

The methods are based on the assumption
that it is a matter of chance which particular
subjects are in the sample that is being
studied, and the sampling variability is thus
random variation which is determined by the
laws of probability. Therefore, the inferences
are expressed in terms of probability. The
situation is illustrated below.

[Diagram: Population -> (sampling variation) -> Sample data -> (uncertainty) -> Inferences on population]

Taking a sample from the population
involves sampling variation. As a consequence
of this, inferences from the sample
data back to the population involve
uncertainty.

A statistical analysis may be thought of as
asking questions of the data. In an investigation
that compares two groups for the
mean value of, for example, blood pressure
or the prevalence of some disease, three
questions may be posed: Is there a difference
between the groups? How large is the
difference? How accurately is the size
of the difference known?

As expressed, the first question expects the
answer "yes" or "no"; although the answer
cannot be given in precisely these terms, it
is often reduced to two possibilities. The
appropriate methodology is the significance
test. The second question expects a numerical
value to be the answer. This is an estimate
and, as it is a single value, is referred to as
a point estimate. In effect, the third question
asks how reliable this point estimate is; the
answer is a range of values which is referred
to as an interval estimate or a confidence
interval.

These questions represent two approaches 
to inference: hypothesis testing and 
estimation. Although at first sight they
appear to be quite different, in concept they
have much in common. Both make
inferential statements about the value of a 
parameter. (A parameter is an unknown 
quantity which partly or wholly characterizes 
a population, for example, a mean or a 
measure of association.) 

The significance test is an appropriate
technique when there is an a priori hypothesis
to test. For the purpose of the statistical test
this hypothesis is expressed in null form
(such as when no difference exists between
groups) and the test evaluates whether the
data are consistent with the null hypothesis.
If the data differ markedly from those which
would be expected under the null hypothesis,
to the extent that the probability of such an
extreme result is low, then it is said that the
result is statistically significant. Probability
is measured on a continuum between 0 and
1, but in significance testing a probability is
considered low if it is less than conventional
values such as 0.05 (5%) or 0.01 (1%). A
significant result is equated with the rejection
of the null hypothesis or the claim of a real
effect. By definition, when the null
hypothesis is true, significant results will
occur by chance with the same relative
frequency as the significance probability.
That is, real effects will be claimed when the
null hypothesis is true; however, the probability
of this error (type I) is determined in
the data analysis.

One disadvantage of a significance test is
that it may fail to detect a real effect; that
is, although the null hypothesis is false, the
evidence is not strong enough to reject it. The
probability of this error (type II) can be
controlled at the design stage only, by
appropriate selection of the sample size, and
may be quite large. Thus, the trap of
equating non-significance with no effect
must be avoided; failure to reject the null
hypothesis is not the same as accepting it.

In the approach of confidence interval
estimation no particular hypothesis is considered;
rather, the emphasis is on estimating
those values of the parameter with which the
data are consistent. These values form a
range, the confidence interval. The range
is calculated so that there is a high probability
(conventionally 95% or 99%) that
it contains the true value of the parameter.

A significance test is essentially a test of
whether the data are consistent with a
specified parameter value, and the confidence
interval contains those parameter
values with which the data are consistent.
Therefore, a 5% significance test and a 95%
confidence interval contain some information
in common: significance implies that
the null hypothesis value is outside the confidence
interval; non-significance implies that
the null hypothesis value is within the confidence
interval. However, the confidence
interval contains more information because
it is equivalent to performing a significance
test for all values of the parameter, not just
a single value. A confidence interval enables
a reader to see how large the effect may be,
not simply whether it is different from zero.
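
The correspondence between a 5% test and a 95% interval can be checked numerically. The following is a minimal sketch, assuming a normal approximation and using the same standard error for the test and the interval; the estimate, standard error and null values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def confidence_interval(estimate, se, level=0.95):
    """Normal-approximation confidence interval for a parameter."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


def p_value(estimate, se, null_value=0.0):
    """Two-sided P value for the null hypothesis parameter == null_value."""
    z = abs(estimate - null_value) / se
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical estimate and standard error (not taken from the article).
estimate, se = 10.0, 4.0
low, high = confidence_interval(estimate, se)

# Test several null values: rejection at 5% should coincide with the
# null value lying outside the 95% interval.
for null_value in (0.0, 2.0, 5.0, 15.0, 18.0, 20.0):
    significant = p_value(estimate, se, null_value) < 0.05
    outside = not (low <= null_value <= high)
    print(null_value, significant, outside)
```

Each printed row shows the same truth value twice, which is the sense in which the interval summarizes a significance test for every candidate parameter value at once.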

The limitations of the interpretations that 
are provided by a significance test may now 
be considered. 

The difference is significant. This means
that there is a difference or, in other words,
the size of the difference is not zero. We
know no more than this. The difference may
be large and of great importance or it may
be small and of no practical importance. It
is unsatisfactory that the test provides no way
of distinguishing between these quite
different possibilities.

The difference is not significant. This
means that there is insufficient evidence to
enable us to conclude that there is a
difference. So the difference may well be
zero. But this is not the same as saying that
it is zero. The true difference may be quite
large. Again, it is unsatisfactory that this
possibility is not addressed.

The conclusions that may be drawn from
a significance test are considered to be
incomplete because it is rarely that one is
interested solely in whether a null hypothesis
is or is not true; indeed in many cases it may
be recognized at the outset that the null
hypothesis is unlikely to be true. Rather, the
question is how large is the difference and
is it possibly large enough to be important?
The emphasis is on measuring rather than on
testing. The addition of the concept of an
important difference to that of a null
hypothesis means that there are four possible
interpretations to an analysis: (a) the
difference is significant and large enough to
be of practical importance; (b) the difference
is significant but too small to be of practical
importance; (c) the difference is not
significant but may be large enough to be
important; and (d) the difference is not
significant and also not large enough to be
of practical importance.

The size of difference that is considered
to be large enough to be important is a
matter for debate, and genuine differences
of opinion may arise. It is a medical, not a
statistical, question, although a medical
statistician who is experienced in the subject
area could contribute to setting a value. The
fact that agreement on a unique value may
be impossible in no way detracts from the
argument. In fact, expressing the results as
a confidence interval enables interpretations
to be made for any particular value that is
considered appropriate.

These possibilities are illustrated in the
Figure where the confidence intervals are
shown. The significant and nonsignificant
cases are distinguished by the confidence
intervals that exclude or include zero
respectively. The main point is that in each case
the confidence interval gives the range of
possible values for the true difference. Of
particular concern is (c). Here there may be
no true difference or there may be a large,
important difference. In other words the
study is completely inconclusive. Such a
possibility is missed by the simple expression
"not significant" with its lure of equating
this falsely with "no effect". This situation
will arise with a study that is carried out on
too small a sample and this is why good study
design demands attention to sample size to
try to prevent the occurrence of an inconclusive
result. Altman found that it was
common for undue emphasis to be placed on
"negative" findings from small studies,[2]
while Freiman et al. noted that "negative"
trials were often too small to constitute a fair
test of therapies.[3] Similarly, a significance
test will contrast (b) as significant and (d) as
not significant but fails to recognize that they
give essentially the same conclusion: that
any difference is too small to be important.
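
The four interpretations (a) to (d) can be sketched as a simple classification of a confidence interval. This is an illustrative sketch only: the function name, the labels and the threshold values are hypothetical, and it assumes that a difference "may be important" whenever some value inside the interval is at least as large in magnitude as the chosen threshold.

```python
def interpret(lower, upper, important):
    """Classify a confidence interval for a difference.

    lower, upper: confidence limits for the difference.
    important: smallest absolute difference judged to matter
               (a medical judgment, not a statistical one).
    """
    # Significant means the interval excludes zero.
    significant = lower > 0 or upper < 0
    # Assumption: the effect may be important if the interval
    # reaches the threshold in either direction.
    may_be_important = max(abs(lower), abs(upper)) >= important

    if significant and may_be_important:
        return "(a) significant and possibly important"
    if significant:
        return "(b) significant but too small to be important"
    if may_be_important:
        return "(c) not significant but possibly important (inconclusive)"
    return "(d) not significant and too small to be important"


# Hypothetical intervals, all judged against a threshold of 10 units.
print(interpret(12.0, 30.0, 10.0))   # case (a)
print(interpret(1.0, 6.0, 10.0))     # case (b)
print(interpret(-8.0, 25.0, 10.0))   # case (c)
print(interpret(-3.0, 4.0, 10.0))    # case (d)
```

Because the threshold is an argument, readers who disagree about what counts as important can rerun the same interval against their own value, which is the flexibility the text attributes to reporting intervals.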

As an example, consider some results
which were obtained by Garraway et al. from
a clinical trial for the management of acute
stroke in the elderly.[4] Of 155 patients who
were managed in a stroke unit, 78 were
assessed as independent when they were
discharged from the unit compared with 49
of 152 who were managed in a medical unit.
The simplest analysis shows that the
difference between the success rates of the
two units is significant at the 1% level.
Therefore, a genuine effect has been established.
To appreciate the importance of this
effect the advantage of the stroke unit may
be measured by the difference between the
two units in the percentage of subjects
who were discharged as independent:
50.3% - 32.2% = 18.1%. This is the point
estimate. The accuracy of this estimate is
given by its standard error (5.5) and the 95%
confidence limits (7.3% and 28.9%). Thus,
the gain could be as large as 29% or as small
as 7%.
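
The figures in this example can be reproduced with a short calculation. The sketch below assumes the usual normal approximation for a difference between two proportions, takes the stroke-unit count as 78 of 155 (the value implied by the quoted 50.3%), and uses a pooled-variance z test as one plausible reading of "the simplest analysis"; the function name is hypothetical.

```python
from math import erfc, sqrt


def compare_proportions(x1, n1, x2, n2, z=1.96):
    """Difference between two proportions with SE, 95% limits and P value."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Standard error of the difference (unpooled), used for the interval.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    limits = (diff - z * se, diff + z * se)

    # Pooled standard error under the null hypothesis of no difference,
    # used for the two-sided z test.
    pooled = (x1 + x2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    p_val = erfc(abs(diff) / se_null / sqrt(2))

    return diff, se, limits, p_val


diff, se, (low, high), p_val = compare_proportions(78, 155, 49, 152)
print(f"difference      = {100 * diff:.1f}%")
print(f"standard error  = {100 * se:.1f}")
print(f"95% limits      = {100 * low:.1f}% to {100 * high:.1f}%")
print(f"significant at 1%: {p_val < 0.01}")
```

The interval conveys both that the effect is real (it excludes zero) and how large it might plausibly be, whereas the P value alone conveys only the first point.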

Recently, Gardner and Altman have
argued against the excessive use of hypothesis
testing and urged a greater use of confidence
intervals.[5] In an appendix to their paper they
give methods to calculate confidence
intervals for the commonly occurring
two-sample comparisons.

In presenting the main results of a study 
it is good practice to provide confidence 
intervals rather than to restrict the analysis 
to significance tests. Only by so doing can
authors give readers sufficient information
for a proper conclusion to be drawn;
otherwise readers have to rely upon the
authors' own interpretation. Therefore,
intending authors are urged to express their 
main conclusions in confidence interval form 
(possibly with the addition of a significance 
test, although strictly that would provide no 
extra information). One of the aims of the 
Journal's statistical review process will be to 
ensure that where possible this is done. 

GEOFFREY BERRY

Associate Professor of Biostatistics
School of Public Health and Tropical Medicine
The University of Sydney

1. Healy MJR. Is statistics a science? J R Statist Soc A
1978; 141: 385-393.

2. Altman DG. Statistics in medical journals. Stat Med
1982; 1: 59-71.

3. Freiman JA, Chalmers TC, Smith H Jr, Kuebler RR.
The importance of beta, the type II error and sample
size in the design and interpretation of the randomized
control trial. N Engl J Med 1978; 299: 690-694.

4. Garraway WM, Akhtar AJ, Prescott RJ, Hockey L.
Management of acute stroke in the elderly: preliminary
results of a controlled trial. Br Med J 1980; 280:
1040-1043.

5. Gardner MJ, Altman DG. Confidence intervals rather
than P values: estimation rather than hypothesis
testing. Br Med J 1986; 292: 746-750.



FIGURE: Confidence intervals showing four possible conclusions in terms of statistical significance
and practical importance. Significant: (a) important, (b) not important. Not significant:
(c) inconclusive, (d) true negative result.
ENVIRONMENTAL TOBACCO SMOKE




Scientific Method 


Scientific inquiry within an epidemiologic study begins 
with framing what is called a "null-hypothesis." The null- 
hypothesis states, in this instance, that ETS is not associated 
with a given disease state (e.g., lung cancer, heart disease, etc.). 
Data are then collected and analyzed in order to test i.e., reject 
or accept, the hypothesis. 

One method which is used to assess the relationship of 
the collected data to a given hypothesis is the test for statistical 
significance.* Simply put, if the data examined yield a
statistically significant result (here, the relationship between
ETS exposure and a disease state), then the scientist is permitted,
on the basis of those data, to reject the null-hypothesis. If the
statistical test is not significant, then the data do not support 
rejection of the null hypothesis. 


* By convention, a 'p' (probability) value less than 0.05 is
deemed statistically significant. A 'p' value less than 0.05
means that results at least as extreme as those observed would
occur by chance less than 5 times out of 100 if the null
hypothesis were true.

"Confidence limits" are the values between which the risk
value can be expected to fall 95% of the time based on the
variability of the underlying data. When the 95% confidence
limits span 1.00 (one limit below and one above), the risk value is
considered not statistically significant, i.e., the results
are compatible with chance and do not support a judgment
regarding an association between exposure and disease.
There is no "absolute proof" involved, and there is
nothing immutable about the concept of significance testing. 
Statistical significance is, after all, a convention. But the 
concept is illustrative, especially in the case of the association 
between ETS exposure and lung cancer. To date, there have been 28 
published reports on ETS and lung cancer, and only five have 
achieved statistical significance. It is clear that the 
preponderance of data do not permit rejection of the null 
hypothesis, i.e., there is no association between ETS exposures 
and lung cancer. In addition, virtually all of the individual 
risks reported in such studies are less than 2, which, to the 
epidemiologist, suggests a "weak" association which is probably 
the result of bias or confounding of factors unrelated to ETS. 


Inadequacies of ETS Studies 


Epidemiologic studies are notoriously unreliable 
in outcome. An observed relative risk of less 
than 1.5-2.0 (some would say up to 3.0) is
inadequate to reject the hypothesis of no 
effect. The overall relative risk calculated 
across studies is well below a minimal value 
for seriously attributing it to the presence 
of a real effect, i.e., it is within the range 
easily due to the "noise" in epidemiologic 
data resulting from the limitations and vagaries 
intrinsic to the methodology and its 
application. This same conclusion also applies 
to nearly all of the studies on an individual 
basis. Another reason for conservative 
interpretation of the ETS studies is that 
several studies are of poor quality (good 
textbook examples of how not to do an 
epidemiologic study) and some were originally 
designed for a different, or broader, purpose 
than assessing health risks from ETS exposure. 
Sources of bias are present to varying degrees
in most of the studies. Lung cancer patients
may tend to overstate their exposure to spousal
smoking as an explanation for their illness.
Bias may result from depending on memory recall
of a subject's exposure to spousal smoking.
Estimates of relative risk may differ markedly
between data collected from the subjects and
data obtained from a surrogate, such as their
children. Histologic verification of lung
cancer was not conducted in all studies and
the error rate may be substantial, e.g., 13%
of the lung cancer cases in the case-control
study of Garfinkel et al. were found to be
incorrectly diagnosed when the histology was
reviewed by one of the authors. (From:
Summary of Public Docket Comments, Draft Risk
Assessment, U.S. EPA, Dec. 1990.)

Peter Lee, a statistician and epidemiologist from the
United Kingdom, has argued that the increased risks reported in
various epidemiologic studies are the result of an inherent bias
in study design rather than the result of any genuine effect from
exposure to ETS.[1-5] Lee presents data which indicate that the
reported risks cannot be explained on the basis of either ETS
exposure or dose for the nonsmoker. It is Lee's contention that the
reported "risks" are the result of bias caused by a small number
of smokers who are misreported in the studies as nonsmokers.

Other kinds of misclassification may contribute to the 
reported increase in lung cancer risks among nonsmokers, according
to several scientists. For example, none of the studies on ETS 
and lung cancer provides direct observational information on ETS 
exposures. Instead, spouses, next-of-kin or friends are asked to 
estimate the amount of ETS to which they think the subject was 
exposed. Such estimates may lead to a kind of misclassification,
called exposure misclassification,[8] which has been shown by
Garfinkel,[7] Friedman [8] and others [9-12] to lead to improper indices of
exposure and incorrect estimations of risk. In Garfinkel's study,
for example, relative risks varied from 0.83 and 0.77 when the
women with lung cancer or the husband was the respondent, to a
risk of 3.57 when a son or daughter responded.[13] That means that
the reported risk for lung cancer in the women exposed to ETS was
less than for women not exposed when either the women's or their
husband's estimates were used.

Dr. S. James Kilpatrick, a biostatistician from the
Medical College of Virginia, has analyzed another form of
misclassification, called differential misclassification, which results
"from the tendency of respondents to inflate the amount of ETS
exposure for lung cancer cases and deflate the report of exposure
for controls."[8] Similarly, Dr. Ernst Wynder, President of the
American Health Foundation, notes that "relatives of a nonsmoking
lung cancer patient are more likely to report passive inhalation
exposure on the part of their relative than are relatives of a
control patient."[14]

A more subtle form of potential bias is known as
"publication bias", which stems from the apparent failure by
journals to publish studies which report negative or weakly positive
results.[15,16] Scientists have recently expressed concern over the
growing trend among such journals to overemphasize (and hence to
publish) only those studies which report positive increases in
risk. Published studies which are combined for meta-analyses
therefore may not truly represent all investigations on the issue
of ETS exposures and lung cancer.

Most of the epidemiological studies on ETS and lung
cancer have failed to consider age differences, diet, occupation
and exposures to indoor or outdoor pollution as potential
confounding elements. The importance of such factors is
underscored by recently published reports from Japan and China.[19-24]
The reports suggest that indoor pollution generated by kerosene
heaters, coal stoves, liquefied petroleum gas and exposures
to cooking oil vapors may be responsible for the increased risk of
lung cancer among Oriental women. Moreover, in 1989, researchers
in the U.S. reported that nonsmokers living with smokers consumed
less carotene (Vitamin A) than did nonsmokers who lived with other
nonsmokers. They concluded that "dietary beta-carotene intake is
a potential confounder and should be measured whenever possible in
studies of the relation between passive smoking and lung cancer."[25]

Dr. Karl Uberla of Germany recently explained why any
attempts to generalize about the significance of reported results
of epidemiological studies on ETS and nonsmoker lung cancer will
likely remain unconvincing, due to scientific deficiencies in each
of the studies.[26] He wrote:

The majority of criteria for a causal 
connection are not fulfilled. There is no 
consistency, there is a weak association, there 
is no specificity, the dose-effect relation 
can be viewed controversially, bias and 
confounding are not adequately excluded, there 
is no intervention study, significance is only 
present under special conditions and the 
biologic plausibility can be judged 
controversially. 
Given these difficulties in interpretation, it is
therefore not surprising that an eminent statistician should
conclude that "it is unlikely that any epidemiological study has
been, or can be, conducted which could permit establishing that
the risk of lung cancer has been raised by passive smoking.
Whether or not the risk is raised remains to be taken as a matter
of faith according to one's choice."[15]

Thus, proponents of the ETS health issue are confronted
with weak associations and generally statistically nonsignificant 
risks in epidemiological studies on ETS. They are nevertheless 
forced to posit a causal mechanism for their theoretical model 
regarding health risks. They find no support in data from the 
actual exposure studies on ETS which suggest that an average 
nonsmoker is exposed, for example, to the nicotine equivalent of 
one one-hundredth to one one-thousandth (or less) of a single 
cigarette per hour. Such exposure data suggest that there is no 
conclusive biological plausibility to the ETS health claim, and 
that the reported risks in epidemiological studies may be 
artefactual, and probably due to bias and unconsidered confounders. 
Environmental Tobacco Smoke and Lung Cancer:
A Critical Assessment*

E.L. Wynder and G.C. Kabat


Summary 

The possibility that exposure to environmental tobacco smoke (ETS) may increase the 
lung cancer risk of nonsmokers has become a cause of public concern. It is unknown 
whether the levels of carcinogens in the diluted sidestream smoke of tobacco products 
that reach the nonsmoker’s lung are sufficient to induce cancer. Available epidemiologic 
studies suggest a slight increase in the relative risk of lung cancer in nonsmokers due to 
exposure to ETS created by a smoking spouse. However, not all studies have found a
significant association. The epidemiologic studies are examined in the light of the criteria
of judgment of causality, including strength of association, consistency, temporality,
methodological issues, and biological plausibility. Suggestions for further research,
including studies in high-exposure populations and greater attention to histology, are
proposed. 

Introduction 

Epidemiologists, chemists, biologists, physiologists, physicians, and public health 
officials have given much attention to the association of environmental tobacco smoke 
(ETS) exposure and the development of lung cancer in nonsmokers. A biological basis
for such an association clearly exists because smoke constituents demonstrated to be
carcinogenic in laboratory animals are inhaled and retained by the nonsmoker.
Metabolites of tobacco-specific smoke constituents have been identified in the saliva,
blood, and urine of nonsmokers after exposure to ETS (Greenberg et al. 1984; Hoffmann
et al. 1984; National Academy of Sciences 1986; USDHHS 1987; Sepkovic et al. 1988).
Several epidemiological studies have found a positive association between ETS exposure
- usually defined as being due to a smoking spouse - and lung cancer (Hirayama 1981a;
Trichopoulos et al. 1981; Correa et al. 1983; Sandler et al. 1985; Garfinkel et al. 1985;
Akiba et al. 1986; Dalager et al. 1986; Pershagen et al. 1987). Other studies have found no
significant association (Garfinkel 1981; Chan and Fung 1982; Koo et al. 1983; Kabat and
Wynder 1984; Wu et al. 1985; Lee et al. 1986). No consistent association has been
reported for lung cancer and exposure to ETS in childhood, which might be expected to
exert a greater effect, especially when followed by exposure throughout adulthood. Of 
course, recall of ETS exposure in childhood is more difficult than recall of such exposure 
in adulthood. 


* Research described herein was performed under USPHS, National Cancer Institute Program
Project Grant CA-32617.


H. Kasuga (Ed.) Indoor Air Quality
© Springer-Verlag, Berlin Heidelberg 1990

The epidemiological study of weak associations is burdened with problems that may 
yield artifactual positive findings or may show negative findings where a real association 
exists. The association of ETS and lung cancer risk, even if weak, would still be of concern 
as a public health problem in that most people are at one time or another exposed to 
smoke from burning tobacco products and the exhaled pollutants of tobacco smokers. A 
weak association in epidemiology requires careful examination and an understanding of 
the variables in question and all of the factors influencing the association (Wynder 1987).

In this overview we critically examine the published studies on ETS exposure and lung 
cancer to determine whether the evidence presented to date permits a sound conclusion as 
to causation. 


General Exposure to ETS 

At the outset we need to emphasize that an association between ETS and lung cancer 
must be deemed possible. A recent survey of self-reported exposure in a hospitalized 
population revealed that 66% of men and 60% of women had ETS exposure in 
childhood; 32 % of the men and 61 % of the women reported ETS exposure in the home in 
adulthood; and 60% of the men and 62% of the women who worked outside the home 
reported ETS exposure at work (Kabat and Wynder, unpublished data, 1987).


Critical Assessment 

The first Surgeon-General's Report on Smoking and Health, published in 1964 (USPHS
1964), clearly delineated the criteria of judgment for causality. These criteria included:
the magnitude of the association, consistency, temporality, and biological plausibility.
Since these criteria were considered necessary to prove causation for a strong association,
namely, active smoking and lung cancer, they should be equally required to determine the
causality of weak associations (Wynder 1987). Let us examine the epidemiological
evidence linking ETS with lung cancer in respect to these criteria.


Strength of the Association 

An association is generally considered weak if the odds ratio is under 3.0 and particularly
when it is under 2.0, as is the case in the relationship of ETS and lung cancer (Table 1). If
the observed relative risk is small, it is important to determine whether the effect could be
due to biased selection of subjects, confounding, biased reporting, or anomalies of 
particular subgroups. 

Consistency 

If an association is real, internal consistency should be apparent within and between
different studies. The majority, but not all, of the studies of ETS and lung cancer have
shown a positive association for ETS-exposure due to a smoking spouse (Table 1). In 
most of the studies, the confidence interval includes 1.0. While the prospective study by 
Hirayama (1981a) among Japanese women showed a significant association with the 
husband's smoking (largely adenocarcinomas), the prospective study among American 


Table 1. Summary of results of studies relating lung cancer risk in married women to their
husbands' smoking habits

Study                         Relative risk   95% Confidence interval

Prospective studies
Hirayama (1981)               1.63            1.25-2.11
Garfinkel (1981)              1.18            0.90-1.54

Case-control studies
Trichopoulos et al. (1981)    2.1             1.18-3.78
Chan & Fung (1982)            0.75            0.44-1.30
Correa et al. (1983)          2.03            0.83-5.03
Koo et al. (1983)             1.54            0.90-2.64
Kabat & Wynder (1984)         0.79            0.26-2.43
Wu et al. (1985)              1.2             0.6-2.5
Garfinkel et al. (1985)       1.12            0.74-1.69
Lee et al. (1985)             1.03            0.41-2.47
Akiba et al. (1986)           1.48            0.88-2.50
Pershagen et al. (1987)       1.28            0.75-2.16
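
The reading of Table 1 given in the text (most intervals include 1.0, most point estimates lie above 1.0) can be checked mechanically. The sketch below is illustrative only: the values are transcribed from Table 1, and two upper limits that are garbled in the scanned source (Hirayama, Chan & Fung) are entered as assumed readings; neither affects the result, since only the lower limit decides whether an interval with a relative risk above 1.0 excludes 1.0.

```python
# (study, relative risk, lower 95% limit, upper 95% limit), from Table 1.
TABLE_1 = [
    ("Hirayama (1981)", 1.63, 1.25, 2.11),       # upper limit assumed (garbled in scan)
    ("Garfinkel (1981)", 1.18, 0.90, 1.54),
    ("Trichopoulos et al. (1981)", 2.1, 1.18, 3.78),
    ("Chan & Fung (1982)", 0.75, 0.44, 1.30),    # upper limit assumed (garbled in scan)
    ("Correa et al. (1983)", 2.03, 0.83, 5.03),
    ("Koo et al. (1983)", 1.54, 0.90, 2.64),
    ("Kabat & Wynder (1984)", 0.79, 0.26, 2.43),
    ("Wu et al. (1985)", 1.2, 0.6, 2.5),
    ("Garfinkel et al. (1985)", 1.12, 0.74, 1.69),
    ("Lee et al. (1985)", 1.03, 0.41, 2.47),
    ("Akiba et al. (1986)", 1.48, 0.88, 2.50),
    ("Pershagen et al. (1987)", 1.28, 0.75, 2.16),
]


def excludes_one(lower, upper):
    """True when the interval lies wholly above or wholly below 1.0."""
    return lower > 1.0 or upper < 1.0


significant = [name for name, rr, lo, hi in TABLE_1 if excludes_one(lo, hi)]
above_one = [name for name, rr, lo, hi in TABLE_1 if rr > 1.0]

print(f"{len(above_one)} of {len(TABLE_1)} point estimates exceed 1.0")
print(f"{len(significant)} of {len(TABLE_1)} intervals exclude 1.0: {significant}")
```

A count of this kind only tallies individual studies; it does not combine them, and an interval that includes 1.0 may also include values well above it.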


Table 2. Distribution of lung cancer by histologic groups in smokers and never-smokers. (From
Kabat and Wynder 1984)

                                        Smokers                    Never-smokers
                                        Males        Females       Males           Females
                                        (N = 1882)   (N = 652)     (N illegible)   (N illegible)
                                        [%]          [%]           [%]             [%]

Kreyberg I                              63           52            35              21
Kreyberg II                             32           43            54              74
Mixed and undifferentiated/anaplastic    5            5            11               5


women by Garfinkel (1981) did not. It has been suggested that Japanese and American 
women are exposed to different levels of ETS due to different conditions in the two 
countries. Such differences could account for this disparity (Hirayama 1981b). 
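
The earlier remark that in most of the studies the confidence interval includes 1.0 can be checked against Table 1. The following Python sketch is an illustration added here, not part of the original analysis. The tuples transcribe Table 1; two upper limits that are poorly legible in the table (Hirayama, Chan & Fung) are read as 2.11 and 1.30, an assumption that does not change the count because the Hirayama lower limit already exceeds 1.0 and the Chan & Fung lower limit is below it.

```python
# Table 1 as (study, relative risk, lower 95% limit, upper 95% limit).
TABLE_1 = [
    ("Hirayama (1981)", 1.63, 1.25, 2.11),      # upper limit assumed (poorly legible)
    ("Garfinkel (1981)", 1.18, 0.90, 1.54),
    ("Trichopoulos et al. (1981)", 2.1, 1.18, 3.78),
    ("Chan & Fung (1982)", 0.75, 0.44, 1.30),   # upper limit assumed (poorly legible)
    ("Correa et al. (1983)", 2.03, 0.83, 5.03),
    ("Koo et al. (1983)", 1.54, 0.90, 2.64),
    ("Kabat & Wynder (1984)", 0.79, 0.26, 2.43),
    ("Wu et al. (1985)", 1.2, 0.6, 2.5),
    ("Garfinkel et al. (1985)", 1.12, 0.74, 1.69),
    ("Lee et al. (1985)", 1.03, 0.41, 2.47),
    ("Akiba et al. (1986)", 1.48, 0.88, 2.50),
    ("Pershagen et al. (1987)", 1.28, 0.75, 2.16),
]


def includes_unity(lower, upper):
    """Return True when the interval [lower, upper] contains a relative risk of 1.0."""
    return lower <= 1.0 <= upper


# Studies whose interval lies entirely above or below 1.0.
excluding = [name for name, _, lo, hi in TABLE_1 if not includes_unity(lo, hi)]
n_including = len(TABLE_1) - len(excluding)

print(f"{n_including} of {len(TABLE_1)} intervals include 1.0")
print("Intervals excluding 1.0:", excluding)
```

On this reading, only the Hirayama and Trichopoulos intervals exclude 1.0.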

Within those studies presenting specific histologic analysis, differences exist in 
respect to the type of lung cancer involved. In active smokers, tobacco smoke exposure 
has a causative effect predominantly on squamous and small cell types of lung cancer 
(Kreyberg I), with a lesser, though still significant causative effect on the glandular type 
(Kreyberg II) (Wynder and Stellman 1977). Among nonsmokers, however, the glandular 
type of lung cancer predominates among both men and women (Kabat and Wynder 
1984) (Table 2). The effect of ETS would thus be expected to be primarily responsible 
for the higher rate of adenocarcinomas among nonsmokers. The studies by Dalager 
et al. (1986) and Pershagen et al. (1987), however, suggest that the effect of ETS 
exposure is limited to induction of squamous cell lung cancer (Table 3). If this were, in 
fact, the case, then only the squamous or small cell type of lung cancer in nonsmokers 
would be affected by ETS. Clearly, it is important that investigations of the effect of 
ETS exposure on lung cancer development in nonsmokers take histology into account, 
so as to determine whether an effect of ETS is limited to certain histological types. 

Since smoking is more prevalent in lower income groups, at least among men, 
lung cancer in nonsmoking women in these groups should have a higher incidence. 
Thus, the influence of the level of education on smoking habits in the examined 
population needs to be considered as a possible confounder. Few studies to date 
have done this. 


Methodological Issues 

A particular concern in weak associations is reporting bias, that is, potentially 
differential reporting of exposures between cases and controls. In terms of ETS, does the 
lung cancer patient report exposure to tobacco smoke, be it at work, at home, at social 
functions, in childhood or adulthood, differently than the control? The case is likely to 
have a different attitude toward this question than does the control, a handicap not 
applicable to prospective studies. It needs to be determined whether the case's attitude 
towards questions on ETS exposure leads to under- or overreporting. Cases are likely to 
underreport their own smoking (Lee 1987), and they may tend to overreport their 
exposure to ETS and other potential hazards that could account for their illness. In 
studies that use proxy reports, different relatives may respond differently. Garfinkel et al. 
(1985) provide some insight into this phenomenon by showing that if the response came 
from the patient, the odds ratio was 1.0, if from the husband it was 0.92, and if from the 
daughter or son, 3.19 (Table 4). More work is needed on the validity of ETS-exposure 
information obtained from different relatives before we can evaluate which of these 
relative risks is closer to the truth. 

In general, possible reporting bias represents a serious problem in case-control studies 
because it can produce a systematic artefact. It is particularly worrisome in that it cannot 
be effectively measured. 

We also need to consider misclassification that can occur in both retrospective and 
prospective studies. Lee has proposed (Lee et al. 1986; Lee 1987) that the reported ETS 
effect on lung cancer risk can be explained by a misclassification of smokers as 
nonsmokers. According to these studies, a substantial percentage of respondents 
misrepresent their smoking habits. Using a 10.0% misclassification rate of ex-smokers as 
self-reported never-smokers coupled with the concordance of spouses' smoking habits, 

Table 4. Data from Garfinkel et al. (1985) by type of respondent 

                       Husband's smoking habits at home 
Respondent           N of cases       OR        95% C.I. 

Self                     16          1.00      0.55- 1.74 
Husband                  34          0.92      0.63- 1.34 
Daughter/son             48          3.19      0.91-11.19 
Other                    36          0.77      0.57- 1.03 


[Figure: odds ratio (vertical axis) plotted against years since quitting (horizontal axis)] 

Fig. 1. Odds ratio of male ex-smokers for Kreyberg I (N = 687) and Kreyberg II (N = 301) lung 
cancer by years since quitting (controls = 6334). Source: American Health Foundation data 


Lee calculated that an apparent increase in lung cancer risk can be obtained among 
nonsmokers married to smokers that approximates the increased risk observed in a 
number of epidemiologic studies (Lee 1987). At the extreme, Garfinkel et al. (1985) 
showed that 40% of lung cancer cases classified as "nonsmokers" in the hospital chart 
were in fact smokers as determined by interview. Although such a high rate of 
misclassification does not occur when cases are interviewed personally, to some extent 
denial is likely to occur even then, particularly among ex-smokers who had stopped 
smoking ten or more years ago. The risk of lung cancer among long-term ex-smokers, 
and even among ex-smokers who quit more than 16 years earlier, does remain elevated 
above the rate among those who never smoked (Fig. 1). Denial of past smoking may also 
not be uncommon in populations where smoking is or was socially unacceptable, as is the 
case among older Japanese women. 

Table 5. Percent of lung cancer cases who never smoked by histologic group (A.H.F. data) 

                          Males                           Females 
                  KI*            KII**            KI*            KII** 
               [%]     N      [%]     N       [%]     N      [%]     N 

1969-1973      1.2    488     5.6    142     10.7    103    23.7     76 
1974-1976      1.6    887     3.0    305     16.4    263    25.3    146 
1977-1980      2.1    628     4.6    390      5.6    231    22.0    245 
1983-1985      1.4    725     3.6    463      6.8    311    16.6    284 

*  Kreyberg I 
** Kreyberg II 


Another problem for epidemiologists involves subgroup analysis (Stallones 1987). 
Investigators are likely to examine numerous subgroups, and then prefer to present those 
subgroups that best fit the hypothesis. This tendency represents an inherent problem in 
epidemiology. The investigator should at a minimum give an idea of how many 
subgroups were originally examined and how many subgroups were discarded. 


Temporality 

One of the factors that led to the conclusion that active smoking causes lung cancer was 
that the increase in cigarette consumption preceded the increase in lung cancer rates, first 
in men and later in women. Enstrom (1979) has reported an increase in the lung cancer 
rate in nonsmokers over recent years, suggesting that factors in addition to personal 
cigarette smoking influence lung cancer mortality rates. The groups examined, however, 
are not strictly comparable, and misclassification of smokers as nonsmokers in the 
national surveys needs to be considered. Our data from a long-term, hospital-based 
case-control study do not indicate an increase in the percentage of male nonsmokers with lung 
cancer in either of the two main histologic groupings (Kreyberg I and II) over the last 30 
years (Table 5). 

In fact, the percentage of nonsmokers with lung cancer among women has declined, 
which may be a consequence of the diminishing pool of women who have never smoked. 


Biological Plausibility 

Several studies have demonstrated that most tumorigenic agents are present in undiluted 
sidestream smoke in higher concentrations than in mainstream smoke (Hoffmann et al. 
1983; National Academy of Sciences 1986; Hoffmann and Wynder 1986) (Table 6). 
Biochemical studies indicate that nonsmokers exposed to ETS have levels of nicotine or 
cotinine in the blood or urine that are about 1/100th the level seen in active smokers 
(Table 7) (Jarvis et al. 1984; National Academy of Sciences 1986). Some of the nicotine 
measured in the blood and urine represents nicotine that is absorbed by the saliva of 
nonsmokers and does not reach the lung directly (Jarczyk et al. 1987). It is important to 

Table 6. Distribution of compounds in undiluted cigarette mainstream smoke (MS) and sidestream 
smoke (SS) 

Nonfilter cigarettes 

                                   MS                   SS/MS 

(A) Vapor phase 
Carbon monoxide                    [illegible] mg       2.5-4.7 
Carbon dioxide                     20-40 mg             8-11 
Benzene                            20-50 µg             10 
Formaldehyde                       5-100 µg             0.1-50 
Acrolein                           50-100 µg            8-15 
Acetone                            100-250 µg           2-5 
Hydrogen cyanide                   400-500 µg           0.1-0.25 
Hydrazine                          24-43 ng             3.0 
Ammonia                            50-170 µg            40-170 
Methylamine                        11.5-28.7 µg         4.2-6.4 
Nitrogen oxides                    50-600 µg            4-10 
N-nitrosodimethylamine             10-180 ng            20-100 
N-nitrosopyrrolidine               2-110 ng             6-30 

(B) Particulate phase 
Particulate matter                 15-40 mg             1.3-1.9 
Nicotine                           1-2.5 mg             2.6-3.3 
Phenol                             60-140 µg            1.6-3.0 
Catechol                           100-350 µg           0.6-0.9 
Hydroquinone                       110-300 µg           0.7-0.9 
Aniline                            360 ng               30 
2-Toluidine                        30-160 ng            19 
2-Naphthylamine                    4.3-27 ng            30 
4-Aminobiphenyl                    2.4-4.6 ng           31 
Benz(a)anthracene                  40-70 ng             2-4 
Benzo(a)pyrene                     10-40 ng             2.5-3.5 
N'-Nitrosonornicotine              120-3,700 ng         0.5-3 
NNK                                120-950 ng           1-4 
Cadmium                            100 ng               7.2 
Nickel                             20-3,000 ng          13-30 
Polonium-210                       0.03-1.0 pCi         7 





note that nicotine occurs in ETS primarily as a vapor phase constituent rather than in the 
particulate matter of the aerosol as is the case in mainstream cigarette smoke (Eudy et al. 
1987). Measurement of nicotine or its metabolites will, therefore, not reflect the 
proportional uptake of particulate matter from ETS. In the light of our present 
knowledge of dose-response in carcinogenesis and because the carcinogenic activity of 
tobacco smoke as measured in animal systems is relatively low, the question needs to be 
raised whether the carcinogenic potential of inhaled ETS suffices to induce lung cancer. 
Hoffmann and Hecht (1985) have proposed nicotine-derived nitrosamines in ETS as 
organ-specific carcinogens for the lung. It is possible that these chemicals reach the lungs 
in sufficient dose to induce neoplastic changes. These carcinogens may also be formed 
endogenously from inhaled or ingested nicotine and appropriate nitrosating agents 
(Hoffmann and Hecht 1985). Tumor promoters are less likely to play a role in ETS 

Table 7. Approximate relations of nicotine as a parameter between non-smokers, passive smokers 
and active smokers. (From Jarvis et al. 1984) 

                      Non-smokers without          Non-smokers with             Active 
                      ETS exposure                 ETS exposure                 smokers 
                      No. = 46                     No. = 54                     No. = 94 
                      Mean      % of active        Mean        % of active      Mean 
                      value     smokers' value     value       smokers' value   value 

Nicotine (ng/ml) 
  in plasma           1.0       7                  0.8         5.5              14.8 
  in saliva           3.8       0.6                5.5         0.8              673 
  in urine            3.9       0.2                12.1 [?]    0.7              1,750 

Cotinine (ng/ml) 
  in plasma           0.8       0.3                2.0*        0.7              275 
  in saliva           0.7       0.2                2.5**       0.8              310 
  in urine            1.6       0.1                7.7 [?]     0.6              1,390 

Differences between non-smokers exposed to ETS compared with non-smokers without 
exposure: 
*   p < 0.01 
**  p < 0.001 
[?] significance marker illegible 

carcinogenesis than in active smoking because of their much lower concentration. In 
general, tumor promoters are effective only when applied repeatedly in relatively large 
amounts. 

In considering the existing data on ETS exposure and lung cancer, it is noteworthy 
that Auerbach et al. (1961) showed only minor histological changes in the bronchial 
epithelium of nonsmokers and found that the ciliated columnar epithelium that covers 
their bronchi was largely intact. Deposition of carcinogenic smoke particulates can take 
place only upon inhibition of the protective functioning of the lung clearance system. 
Squamous cell lung cancer can arise only from ciliated columnar cells that have 
undergone squamous metaplasia. 

An active smoker with each puff from a cigarette inhales a volume of 35-50 ml of a 
concentrated aerosol containing 3-5 billion particles per ml that adversely affect the 
protective cilia and mucous defense system of the bronchi (Ferin et al. 1965). The passive 
smoker is at no time exposed with such force to such a highly polluted inhalant. 
Furthermore, ETS particles are more likely to be deposited in the upper respiratory tract 
and not predominantly in the bronchi as is the case in active smoking. Thus, our 
respiratory defense system may be able to deal more readily with the relatively lighter 
deposition of particles and exposure to volatiles in ETS, as the observation by Auerbach 
et al. (1961) would suggest. 


Future Studies 

Future epidemiological studies on the association of ETS with lung cancer should 
attempt to avoid the pitfalls discussed above. The definitive evidence that a factor causes 
human cancer requires support from descriptive, metabolic, and molecular epidemiology. 

Beyond extension of prospective studies, such as those now in progress by Garfinkel and 
Stellman at the American Cancer Society, we suggest: 

1) Continuing ongoing case-control studies with special reference to histologic type and 
careful consideration of methodological issues. 

2) Estimating the relative importance of ETS exposure in different settings - in the 
home, in the workplace, in social situations, and during transportation. 

3) Further studying lung cancer rates among pipe and cigar smokers, and, if feasible, 
among nonsmokers exposed to ETS from these products. 

4) Studying lung cancer incidence in groups occupationally exposed to high levels of 
ETS at their worksite such as waiters, bartenders, train conductors, airplane 
personnel, and office workers. 

5) Studying bronchial epithelium in autopsy material of established never-smokers 
whose exposure to ETS is known. 

6) Determining the incidence of lung cancer by histological type in confirmed never- 
smokers. 

7) Comparing the presence of adducts of tobacco-specific carcinogens with DNA in 
smokers, passive smokers, and "never-smokers" (Hoffmann and Hecht 1985; Hecht et 
al. 1987). 

In summary, verification of the possible association of ETS and lung cancer represents an 
important challenge to epidemiologists, laboratory scientists, and public health authorities. 
The public is entitled to inhale the cleanest possible air regardless of whether ETS is 
proven to be cancer-inducing. Additional efforts on the part of epidemiologists are 
required to firmly establish the nature and significance of the reported associations 
between passive smoking and lung cancer. 


What Is the Epidemiologic Evidence for a Passive Smoking-Lung Cancer Association? 

N. Mantel 


Summary 

Two survey articles of reports on the association of passive smoking with lung cancer 
have recently appeared, and also a comprehensive report on the subject of environmental 
tobacco smoke by a committee of the National Research Council of the United 
States. The observed excess over a relative risk of unity cannot be explained by chance. 
Nor can it be fully accounted for by a particular source of bias, the false claims of being 
non-smokers by individuals who were active or ex-smokers. That possible source of 
bias leads, in one summary survey, to reducing a relative risk of 1.35 to 1.30, but from 
1.34 to 1.15 in the National Research Council report. The latter report suggests that 
statistical significance would no longer obtain, perhaps particularly because of other 
possible biases. However, to get an estimate of the correct relative risk due to passive 
smoking, allowance has to be made for actual exposure to passive smoking of those not 
exposed at home. Thus, the 1.30 is adjusted upwards, by 18% in one survey, to 1.53, but 
the 1.15 by only 8% in the National Research Council report, to 1.24. The National Research 
Council report had given an anticipated relative risk of 1.1 based on dosimetric 
considerations. But it is suggested here that that could be as low as 1.05, too low to be 
detected in an epidemiologic investigation - in any case it would be based on 
hypothetical assumptions. 

In November of 1986 there were two near-simultaneous review articles addressing the 
subject of passive smoking and lung cancer. One was an invited guest editorial by Blot 
and Fraumeni in the Journal of the National Cancer Institute, the other a contemporary 
theme discussion by Wald et al. in the British Medical Journal [1, 2]. 

There was substantial overlap between the two articles in the publications on the 
subject that they covered and on which the conclusion of a significant positive association 
was based. The article by Wald et al. gave, perhaps, more statistical detail about the 
results of the several studies covered. But, to my mind, there was uncritical acceptance of 
the results of all the studies. Blot and Fraumeni did suggest that there were some flaws in 
a particular study, that by Hirayama [3], but decided that any inherent biases in that 
investigation could not have given rise to the observed elevated risk. 

From their overall evaluation of 10 case-control studies (all 10 gave results for 
females, five separately for males as well) and three prospective studies (two of these 
covered males separately), which provided 20 separate relative risk (actually odds ratio) 
values, Wald et al. came up with a summary relative risk of lung cancer due to passive 
smoking of 1.35 (95% limits 1.19 to 1.54). They trim this down to 1.30 on the basis that 
some of the presumed non-smokers exposed to passive smoking were actually smokers. 
Then, on the added basis that even those unexposed to passive smoking at home may still 
have been exposed when away from home, they raise their estimate of relative risk to 1.53. 
But note that this last modification presupposes the answer, that passive smoking does 
elevate the risk. For if it did not, there would be no basis for adjusting the 1.30 or 1.35 
upwards to 1.53. 

Blot and Fraumeni come up with a similar summary measure of relative risk for 
passive smoking of 1.3 (95% limits of 1.1-1.5), but elevated to 1.7 (95% limits of 1.4-2.1) 
for heavy passive smoking. These authors suggest that heavy passive smoking is 
equivalent, at least in terms of nicotine received, to smoking between 1/2 and 3 cigarettes 
daily, and estimate that smoking a few cigarettes daily would give rise to a relative risk of 
about 1.5-fold to twofold. 

While Blot and Fraumeni do not address the question of correct reporting of non-smoking 
status, Wald et al. do, having used this as a basis for lowering the relative risk 
estimate from 1.35 to 1.30. Based on reports and communications from others, Wald et 
al. estimate that persons reporting themselves as never having smoked (lifelong 
non-smokers) comprise 2.1% active smokers plus 4.9% former smokers, for a total of 7% ever 
smokers among the self-claimed never smokers. Wald et al. estimate that these 7% have a 
combined relative risk of 2, making the assumption in doing this that the active smokers 
among the 7% smoked on average only a quarter as much as active smokers generally. 
The relative risk of 2 for the 7% is computed as a weighted average of 3 for active 
smokers and 1.5 for former smokers among the 7%. 
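
The weighted average described above can be reproduced in a few lines. This is a minimal illustrative sketch in Python; the function name is hypothetical, and the only inputs are the shares (2.1% and 4.9%) and relative risks (3 and 1.5) quoted in the text.

```python
def weighted_relative_risk(groups):
    """Share-weighted mean relative risk.

    groups: list of (share, relative_risk) pairs; shares need not sum to 1.
    """
    total_share = sum(share for share, _ in groups)
    return sum(share * rr for share, rr in groups) / total_share


# Misreporting "never smokers" as described by Wald et al.:
# 2.1% active smokers at RR 3 and 4.9% former smokers at RR 1.5.
misreporters = [(2.1, 3.0), (4.9, 1.5)]
combined_rr = weighted_relative_risk(misreporters)

print(round(combined_rr, 2))  # 1.95, which the text rounds to 2
```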

If 7% of reported never-smokers were actually ex-smokers or active smokers, which 
were they - the spouses, say, of smokers or the spouses of non-smokers? In my own 
critique of Hirayama, I had suggested that this false reporting of non-smoking status 
would preferentially be among those with smoking spouses [4]. If, for example, the 7% 
overall misreporting of non-smoking status were concentrated among spouses of smokers, it 
would be somewhat higher among persons with smoking spouses who, nevertheless, 
claimed to be never smokers. Suppose we take it at 20%, in which case the relative risk of 
reported lifelong non-smokers with smoking spouses would be 1.20. It could be substantially higher but for 
the assumption by Wald et al. that the active smokers among the reported never smokers 
had sharply reduced levels of smoking. However, Wald et al. were ready to make only a 
small reduction in relative risk for this factor, from 1.35 to 1.30. Their speculative 
increase, which might have no basis at all, was much greater, from 1.30 to 1.53. 
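
The figure of 1.20 follows from treating the group of reported never smokers with smoking spouses as a mixture. The Python sketch below is an illustration under assumptions stated here rather than in the original: passive smoking itself has no effect (true never smokers have a relative risk of 1), misreporters carry the combined relative risk of 2 used by Wald et al., and there is no misreporting in the comparison group. The function name is hypothetical.

```python
def apparent_relative_risk(misreport_fraction, rr_misreporters, rr_true_never=1.0):
    """Relative risk of a group made up of true never smokers and misreporters."""
    return ((1.0 - misreport_fraction) * rr_true_never
            + misreport_fraction * rr_misreporters)


# 20% misreporters at a combined relative risk of 2, 80% true never smokers at 1.
rr_spouses_of_smokers = apparent_relative_risk(0.20, 2.0)
print(round(rr_spouses_of_smokers, 2))  # 1.2
```

Under these assumptions the whole of the apparent excess arises from misclassification alone.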

The effect of false reporting of smoking status, specifically of non-smoking, could be 
much sharper than what Wald et al. have suggested. In a study of biochemical markers of 
smoke absorption, Jarvis et al. branded as "deceivers" 21 individuals who claimed to be 
non-smokers [5]. These 21 displayed biochemical patterns very similar to those of actual 
smokers, not at all like those of accepted non-smokers. The 100 accepted non-smokers 
comprised 46 without passive smoking, 54 with. Those 21 would constitute 21/121 or 
about 17% of the total, and these would be active smokers, not just former smokers, a 
proportion eightfold greater than the 2.1% Wald et al. postulated. Perhaps in the epidemiologic 
investigations made, false reporting of non-smoking status is at a much lower level, but it 
would not take much false reporting to account fully for the seeming association between 
passive smoking and lung cancer. 

Recently, a colleague expressed to me the thought that if passive smoking played no 
role in lung cancer, why are we not finding many negative associations, nor any 
significantly negative associations? Actually, six of the 20 relative risks reported in Wald 
et al. are at 1.00 or smaller. And some of those reported as in excess of 1.00 conceal rates 
of under 1.00. Thus, relative to the rate shown of 1.23 for the study reported by Garfinkel 
et al., I have brought out in my own critique that that represented a composite of data for 
various classes of respondents [6, 7]. Where the woman with lung cancer was herself the 
respondent (as to her husband's level of smoking) the relative risk was 0.83. Using the 
husbands' responses, the relative risk was 0.77. It was only on the basis of responses by 
the sons and daughters, at a time long past when they would have left home, that a 
relative risk of 3.57 emerged, sufficiently high to raise the overall estimate of relative risk 
to 1.23. As I indicated in my critique, the replies by the children were more accusatory in 
nature than revealing of any true relationship. 

But even so, it would take 40 large studies to get on average a single seemingly 
significant negative association of lung cancer with passive smoking, assuming statistical 
testing at the 5%, two-tailed, level. But we have only 20 evaluations, with many so small 
that they could not possibly yield any apparently significant protective effect, not even in 
the unrealistic situation that passive smoking was 100% protective. Suppose a study had 
a null expectation of only 2 or 3 passive smokers with lung cancer - then there would be 
some observed number, 5 or 6 or 7 or 8 or more, which would be significantly in excess of 
expectation. But there would be no number, however small or even zero, which would be 
significantly below expectation. Yet just such low expectations characterize several of the 
studies reported on by Wald et al. In one study, a relative risk of 2.29 is shown based on 
only 2 actual cases of lung cancer in passive smokers, expectation 1.20. Another relative 
risk of 2.45 is based on 3 observed, 1.77 expected. For one prospective study, 4 observed 
cases have given rise to an estimated relative risk of 3.25, and in another 7 observed cases 
gave rise to a relative risk of 2.25, suggestive of an expectation little in excess of 3. On the 
other hand, the four reported risks of under 1.00 had expectations variously of 37.67, 
34.08, 6.64 and 13.77. 
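
The asymmetry described above can be illustrated numerically. The following Python sketch is not from the original paper: it assumes that the number of observed cases follows a Poisson distribution with the stated null expectation, and it reads a 5% two-tailed test as 2.5% in each tail. All function names are illustrative.

```python
import math

ONE_SIDED_ALPHA = 0.025  # each tail of a 5% two-tailed test (assumption)


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def upper_tail(k, mean):
    """P(X >= k)."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


def smallest_significant_excess(mean):
    """Smallest observed count that is significantly above expectation."""
    k = 0
    while upper_tail(k, mean) > ONE_SIDED_ALPHA:
        k += 1
    return k


def can_be_significantly_low(mean):
    """Zero is the most extreme low count, so it decides whether any count can be."""
    return poisson_pmf(0, mean) <= ONE_SIDED_ALPHA


for expectation in (2.0, 3.0):
    print(expectation,
          smallest_significant_excess(expectation),
          can_be_significantly_low(expectation))

# With a 2.5% chance of a significantly negative result per large null study,
# one such result is expected, on average, per 1 / 0.025 = 40 studies.
studies_per_false_negative = 1 / ONE_SIDED_ALPHA
```

Under the Poisson assumption, an expectation of 2 needs at least 6 observed cases, and an expectation of 3 at least 8, to be significantly high, while even zero observed cases is not significantly low in either case.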

Of concern to Wald et al. was whether the various relative risks were homogeneous. 
On this point they cite a chi-square test for heterogeneity of 20.0 on 19 degrees of 
freedom, p > 0.2. However, this is not so much evidence of homogeneity of relative risks 
as it is reflective of the high unreliability of the individual relative risks. For 8 of the 20 
relative risks shown, the upper limit on the relative risk exceeds the lower limit by a factor 
of about 10 or more, that factor attaining a value of 57 in one instance. 
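
The quoted p > 0.2 can be checked from the test statistic and its degrees of freedom. The Python sketch below is an added illustration using only the standard library; it evaluates the chi-square upper tail through the power series for the regularized lower incomplete gamma function, and the function name is hypothetical.

```python
import math


def chi_square_upper_tail(statistic, df):
    """P(X >= statistic) for a chi-square variable with df degrees of freedom.

    Uses P(a, x) = x^a e^(-x) / Gamma(a) * sum_n x^n / (a (a+1) ... (a+n)),
    with a = df / 2 and x = statistic / 2.
    """
    a = df / 2.0
    x = statistic / 2.0
    if x <= 0.0:
        return 1.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= x / (a + n)  # ratio of successive series terms
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1.0 - lower


p_heterogeneity = chi_square_upper_tail(20.0, 19)
print(round(p_heterogeneity, 2))  # about 0.39, consistent with p > 0.2
```

A value this close to the degrees of freedom says little by itself; as noted above, wide individual intervals make the test insensitive to real differences between studies.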

Blot and Fraumeni express concern about other long term consequences of passive 
smoking, particularly in connection with coronary artery disease. They cite a report by 
Garland et al. [8] who initially reported a relative risk due to passive smoking of death 
from ischemic heart disease of 14.9, but seem unaware that the estimate of 14.9 has been 
revised downward to 2.7. In the report of the National Research Council [9], which I will 
be discussing below, there is awareness of the downward revision, but not of the fact that 
the suggestive significance of p < 0.10 is lost and becomes p < 0.20. 

That lung cancer may aggregate in families is also of concern to Blot and Fraumeni, 
who cite Ooi et al. on the subject [10]. Elsewhere, and yet to appear, I have suggested that 
apparent familial aggregation, in the instance of breast cancer, may be a reflection of an 
awareness bias rather than of true familial aggregation [11]. If information about 
relatives is not collected more directly, the apparent aggregation based on reports from 
the index case may only reflect heightened knowledge by such cases of similar illnesses 
among relatives. But the report by Ooi et al. is another instance, like that of Garland et al., 
in which there has been unreliable statistical evaluation. Thus, Ooi et al. initially reported 
that the lung cancer risk increased eighteen-fold per 10-year age increase. By letter in the 
October 1986 issue of the Journal of the National Cancer Institute they have revised that 
factor downwards, giving separate factors for each 10-year age interval. From age 50 to 
age 60, the factor is now reported at only 2.9. 


The Report of the Committee on Passive Smoking, Board on Environmental 
Studies and Toxicology, National Research Council [9] 

I have chosen to discuss the epidemiologic aspects of this Report separately, since it is 
essentially the definitive work on current knowledge on environmental tobacco smoke. A 
member of the committee was Nicholas Wald, senior author of one of the articles 
discussed above. The report contains a technical appendix which largely duplicates the 
appendix in the article by Wald et al. and also repeats, with minor variations, the data of 
Wald et al. The body of the report itself contains those same data, but recast differently, 
and it is the same 13 studies, with 20 relative risk values, which underlie the epidemiologic 
aspects of the Committee Report. 

There is a great variety of issues which the Committee Report goes into, whether 
physicochemistry, toxicology, assessment of exposures, use of questionnaires, exposure- 
dose relationships, etc. But my concern at this time is the epidemiology. There could be a 
point to estimating the annual number of lung cancer deaths in the United States due to 
passive smoking, but that would have to be on the presumption that passive smoking 
does play a causative role. 

However, the Committee Report is quite restrained in its findings and leaves open the 
question of whether anything has been established. If the apparent relative risk is 
significantly greater than unity, the excess cannot be fully explained away by certain 
biases considered. However, whether there is statistical significance in view of those 
biases is not addressed. 

From dosimetric considerations, the Report suggests that the excess risk of lung 
cancer due to environmental tobacco smoke should be 1% of the excess risk due to active 
smoking. This leads to a relative risk of 1.14 for men, perhaps less for women. From the 
epidemiologic data, the summary relative risk is 1.34, but it is brought out that for United 
States studies alone the relative risk would be only 1.14. If only large studies are 
considered, the overall relative risk would be 1.32. 
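
The dosimetric argument scales the excess risk of active smoking by the relative dose. The Python sketch below is an added illustration that assumes this linear scaling; the function names are hypothetical. The relative risk of 15 for active smoking in men is not stated in the text; it is the value implied by the quoted 1% fraction and the resulting 1.14.

```python
def passive_relative_risk(active_rr, dose_fraction):
    """Relative risk when the excess risk (RR - 1) scales linearly with dose."""
    return 1.0 + dose_fraction * (active_rr - 1.0)


def implied_active_rr(passive_rr, dose_fraction):
    """Active-smoking relative risk implied by a passive RR and a dose fraction."""
    return 1.0 + (passive_rr - 1.0) / dose_fraction


# An excess of 0.14 at 1% of the active dose implies an active-smoking RR of 15.
print(round(implied_active_rr(1.14, 0.01), 2))      # 15.0
print(round(passive_relative_risk(15.0, 0.01), 2))  # 1.14
```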

Next addressed by the Report is the effect of biases, particularly the bias associated 
with the false reporting of individuals that they were not (or never have been) smokers. 
This leads to a lowering of the estimated relative risk of 1.34 (or 1.30 to 1.34) to 1.15. But 
note that on this same basis, Wald et al. were willing to reduce an apparent relative risk of 
1.35 only slightly, to 1.30. 

Yet another adjustment is made. If non-smokers are not exposed to environmental 
tobacco smoke at home, they might still be exposed to it away from home. An upward 
adjustment of 8% on account of this yields 1.15 × 1.08 = 1.24. This contrasts with the 
upward adjustment of 18% made by Wald et al., who calculated 1.30 × 1.18 = 1.53. The 
Committee Report differs markedly from the separate report made by one of its own 
members. 
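
The two upward adjustments are simple multiplications and can be set side by side. This is a minimal Python sketch added for illustration, with a hypothetical function name; the inputs are the figures quoted above.

```python
def adjust_upward(relative_risk, percent):
    """Multiply a relative risk by (1 + percent / 100)."""
    return relative_risk * (1.0 + percent / 100.0)


committee_adjusted = adjust_upward(1.15, 8)   # Committee Report
wald_adjusted = adjust_upward(1.30, 18)       # Wald et al.

print(round(committee_adjusted, 2))  # 1.24
print(round(wald_adjusted, 2))       # 1.53
```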

In discussing Wald et al. I suggested that the upward modification they made 
presupposed a positive role for passive smoking. The same is true of the 8% 
upward adjustment in the Committee Report. For purposes of evaluating the statistical 
significance of the findings, the relative risk should be taken as 1.15, though the value of 
1.24 might be appropriate for assessing the toll in excess lung cancer due to passive 
smoking assuming that there is causality. With the United States studies indicating an 
unadjusted relative risk of only 1.14 rather than 1.34, both the 1.15 and the 1.24 might be 
sharply lowered if intended to apply only to the United States. 

But let me stay with the relative risk of 1.15 prior to the 8% upward adjustment. Is that 
relative risk significantly in excess of 1.00? I suspect not. And even the question of bias 
remains open. Both in the Committee Report and in the article by Wald et al., the only 
biases factored in were just those that would fit into neat mathematical formulas. More 
subtle biases or ones that had not been thought of did not get in. I gave an example above 
of the use by Garfinkel et al. of the responses by sons and daughters on the level of 
smoking by the fathers. 

I might even speculate about publishing bias. If an investigator got a weakly or 
insignificantly negative result for the role of passive smoking in lung cancer, would he 
bother submitting it for publication? And if he did, would it be accepted for publication? 
Postulating this kind of bias is not necessary for establishing that the 1.15 relative risk is 
likely not significant. But I bring it up in connection with a tendency I see towards 
accepting uncritically or less critically manuscripts which are on the right side of the fence 
on the issue of passive smoking. A particular example was the publication of the article by 
Garland et al. on passive smoking and ischemic heart disease mortality, the claims of 
which fell apart on scrutiny. 

Let me bring up now another thought. Some time ago the possibility of subtle or not- 
so-subtle biases in case-control or other epidemiologic investigations was so much a 
matter of concern that it was suggested that unless the relative risk were at least 2.0, any 
increase in risk should not be accepted. Perhaps we can do better now and might employ a 
less restrictive criterion. 

But I can see no relaxation to the point of accepting the relative risks now observed for
passive smoking in lung cancer. What we must accept is that it is unlikely that any
epidemiologic investigation has been or can be mounted which would establish a causal
role for passive smoking in lung cancer. Those who believe such a role exists should
continue to believe as much, and might even hazard estimates as to the resulting toll in
deaths and disease, with others allowed to hold contrary beliefs. What would be incorrect
would be to claim that epidemiologic studies have established the correctness of the
belief.

If epidemiologic investigations cannot establish a role for passive smoking, the best
we can do is to make suppositious estimates of how great that role may be - and such
suppositious estimates can be too high if any of the underlying supposals are false. One
supposal would be that the dosage response curve is linear through the origin, another
that some particular biochemical measure, say level of cotinine, is a proper measure of
the equivalent exposure to cigarettes of passive smoking. And, I point out, there could be
the assumption that the temperature at which tobacco smoke is inhaled is not relevant,
though I would think that fresh hot smoke would be more active than stale smoke.

With this thought in mind, we can pick up some clues from the report of Jarvis et al.
who, after excluding "deceivers", report average cotinine levels in plasma, saliva, and
urine of 100 non-smokers to be at 0.55%, 0.55% and 0.36% respectively of those levels
from 94 smokers. Let us take it at 0.5%. If the average cigarette smoker has a relative risk
for lung cancer of 10.0 (enhancement of 900%, though the enhancement may be 1,400%
for very active smokers), this would put the enhanced risk due to environmental tobacco
smoke at 4.5%, for a relative risk of 1.045 (it would be 1.07 using the 1,400%
enhancement for very active smokers). That relative risk, 1.045, would encompass both
passive smoking at home and away from home, including individuals not exposed to
passive smoking at home.
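
The arithmetic behind the 1.045 and 1.07 figures can be retraced in a few lines. This is only a sketch of the calculation as stated in the text: it assumes a dose-response curve that is linear through the origin and takes the cotinine ratio as the measure of equivalent exposure. The function and variable names are illustrative, not taken from any cited report.

```python
def dosimetric_relative_risk(smoker_excess_pct, exposure_fraction):
    """Relative risk implied by scaling an active smoker's excess risk
    linearly by the non-smoker's relative cotinine level (an assumption)."""
    excess_pct = smoker_excess_pct * exposure_fraction
    return 1.0 + excess_pct / 100.0


# Cotinine in non-smokers taken as 0.5% of the level in smokers.
exposure = 0.005

# Average smoker: relative risk 10.0, i.e. an enhancement of 900%.
rr_average = dosimetric_relative_risk(900.0, exposure)

# Very active smoker: enhancement of 1,400%.
rr_heavy = dosimetric_relative_risk(1400.0, exposure)

print(round(rr_average, 3), round(rr_heavy, 3))
```

If the linearity assumption fails, or if cotinine is not a proper measure of equivalent exposure, the figures produced this way do not hold.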

What matters, however, relative to the conduct of epidemiologic studies on the
subject, is the differential in relative risk between those knowingly exposed to passive
smoking and those who believe themselves unexposed. From data available in Jarvis et
al., it would appear that those seemingly not exposed to passive smoke (46 in number)
nevertheless have a relative risk of about 1.02. For the 54 non-smokers claimed to be
actually exposed to passive smoking, the relative risk based on cotinine levels would, in
similar manner, be 1.07. Compared then to seemingly non-exposed to passive smoking,
the calculated relative risk for the known exposed to passive smoking would be 1.05. That
small increase in relative risk just would not show up on any epidemiologic investigation
and would be submerged, in any case, by other very likely biases. The National Research
Council report had suggested a relative risk, based on dosimetric considerations, of 1.14,
but on the assumption that enhancement in risk due to active smoking was 1,400%.
An enhancement of 900% would have led them to an anticipated relative risk of 1.09. But
whether we use 1.05, 1.09, or 1.14, the effect would still be undetectable.
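
The two derived figures in this paragraph, the 1.05 differential and the 1.09 rescaling of the National Research Council value, follow from simple ratios. The sketch below reproduces them under the same linear dose-response assumption; the helper name `rescale_excess` is hypothetical.

```python
def rescale_excess(rr, from_pct, to_pct):
    """Rescale the excess risk (rr - 1) in proportion to the assumed
    enhancement for active smoking, as a linear dose-response implies."""
    return 1.0 + (rr - 1.0) * to_pct / from_pct


rr_unexposed = 1.02  # non-smokers seemingly not exposed to passive smoke
rr_exposed = 1.07    # non-smokers reporting exposure to passive smoke

# Relative risk of the known exposed compared with the seemingly unexposed.
differential = rr_exposed / rr_unexposed

# NRC dosimetric figure of 1.14 assumed a 1,400% enhancement;
# rescale it to the 900% enhancement used for the average smoker.
nrc_at_900 = rescale_excess(1.14, from_pct=1400.0, to_pct=900.0)

print(round(differential, 2), round(nrc_at_900, 2))
```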

As a last point, I raise the issue of passive smoking effects on children. If parents can
be shamed into not exposing their children to passive smoking, this is all well and good,
even if the supporting basis is unsound. I note that the ill effects arise mostly in early
childhood, and have two questions. Have the passive smoking effects been isolated from
effects due to mother's smoking prior to the child's birth? To what extent has account
been taken that cigarette smoking concentrates in families with lower socio-economic
status, as evidenced by lower educational level and more unemployment, etc.? Rona et al.
also brought in the factor of overcrowding at home in their report that passive smoking
resulted in some small reduction in the stature of children [12]. But even Rona et al. failed
to take properly into account, as I have suggested, the role of some of these important
factors on smoking rates in their evaluation [13].

What with subtle biases, not so subtle biases, and even extravagant errors, one should
not accept too readily claimed demonstrations of ill effects of passive smoking. Passive
smoking has been the favorite whipping boy of epidemiologists for too long already. The
public is entitled not to be unnecessarily exposed to environmental tobacco smoke, but
any panic is unjustified.

GLOSSARY

Acute: Having a short course; of short duration. 

Animal study: A controlled laboratory experiment in which animals 
are exposed to an agent and the biological effects of this 
exposure are assessed. The exposure may be via food or water 
(ingestion), by injection, by external application or by 
inhalation. Typical effects that might be measured are tumor 
incidence or tissue and organ changes.

Bias: Regarding epidemiologic studies, the operation of factors
in a study's design or execution that erroneously lead to the
appearance of a stronger or weaker association between the
agent in question and disease than in fact exists.

Bioassay: The determination of the activity of a sample of an
agent by noting its effect on a live animal or an isolated
organ preparation.

Carcinogen: A substance or agent designated as capable of producing 
or initiating cancer. 

Carcinogen classification system: A system for stratifying the
weight of evidence for human carcinogenicity, for example,
the system followed by the EPA. The EPA system consists of
the following levels: Group A - carcinogenic to humans;
Group B - probably carcinogenic to humans; Group C - possibly
carcinogenic to humans; Group D - not classifiable as to human
carcinogenicity; and Group E - evidence of non-carcinogenicity
for humans.¹

Case-control study: A type of epidemiologic study which compares
diseased persons (cases) with nondiseased persons (controls)
in association with a common exposure to an agent.

Chronic: Persisting over a long period of time. Regarding animal
studies, refers to administration of the test substance over
a period of several weeks or months.

Cohort study: An epidemiologic study which examines the development 
of a disease in a group (cohort) of persons who are currently 
free of the disease. May assess exposure either prospectively 
or retrospectively. 

Confounding: As applied to epidemiologic studies, the situation
in which the relationship between an agent and a disease
appears stronger or weaker than it truly is due to the
influence of another unknown or unrecognized factor. In
confounding, the agent under consideration is associated with
another agent (a confounding factor, or confounder) which is
itself associated with either an increase or decrease in the
incidence of the disease.

1. The definitions for carcinogen classification system,
dose-response assessment, exposure assessment, hazard
identification, risk assessment, risk characterization and
weight of evidence are taken from the EPA's 1986 "Guidelines
for Carcinogen Risk Assessment," 51 Fed. Reg. 185, 33992-34003.

Dose-response assessment: Part of a risk assessment. Defines the 
relationship between the dose of an agent and the probability 
of induction of a carcinogenic effect. 

Environmental tobacco smoke (ETS): Consists of smoke originating
from the smoldering end of a tobacco product between puffs,
e.g., sidestream smoke, and of smoke exhaled by the smoker.
The components are released into the environment where they
are diluted by ambient air and undergo changes related to
aging over time.

Epidemiology: The branch of science concerned with the patterns
of disease in human populations and the various factors that
influence these patterns.

Exposure assessment: Part of a risk assessment. Identifies
populations exposed to the agent, describes their composition
and size, and presents the types, magnitudes, frequencies and
durations of exposure to the agent.


Source: https://www.industrydocuments.ucsf.edu/docs/kypxOOOO 



Hazard identification: Part of a risk assessment. A qualitative
assessment of risk, dealing with the process of determining
whether exposure to an agent has the potential to increase
the incidence of cancer. It qualitatively answers the question
of how likely an agent is to be a human carcinogen.

In vitro: Literally, within glass; used to refer to laboratory
procedures conducted in a test tube or similar location, often
involving preparations of cells or tissues.

In vivo: Literally, within the living body; used to refer to
laboratory procedures utilizing live animals.

Mainstream smoke (MS): Tobacco smoke drawn through the butt end
of a cigarette.

Meta-analysis: A statistical technique for combining studies into
a single analysis, designed to increase the ability to
statistically detect an association if such an association is
present.


Mutagen: An agent that tends to increase the frequency or extent
of mutation, i.e., physical or biochemical changes in the
genetic material of an organism.

Binder, R., et al., "Importance of the Indoor Environment in
Air Pollution Exposure," Arch Environ Health 31(6): 277-279,
1976.

Pharmacokinetics: The study of the action of chemical substances
in the body over a period of time, including the processes of
absorption, distribution, metabolism and excretion.

Relative risk: The ratio of the incidence rate of a disease among 
individuals exposed to a particular risk factor to the 
incidence rate among unexposed individuals. 

Risk assessment: The determination of adverse health consequences
from exposure to toxic agents. [Will be carried out
independently from considerations of the consequences of
regulatory action.] Includes one or more of the following
components: hazard identification, dose-response assessment,
exposure assessment and risk characterization.

Risk characterization: Part of a risk assessment. Combines the
results of exposure assessment and dose-response assessment
to estimate a carcinogenic risk in quantitative terms.

Risk management: A combination of risk assessment with the
directives of regulatory legislation, together with
socioeconomic, technical, political and other considerations,
to reach a decision as to whether or how much to control future
exposure to suspected toxic agents.

Short-term tests: In vitro (performed on cells or tissue cultures)
tests for mutations, including tests for chromosome
aberrations, DNA damage/repair and other transformations which
provide supportive evidence of cellular changes and may give
information on carcinogenic mechanisms.

Sidestream smoke (SS): Smoke originating from the smoldering end 
of a tobacco product between puffs. 

Statistical significance: A procedure to quantify the probability
that an observed outcome, e.g., an association between an
exposure and a disease endpoint, would arise from random variation
alone. The scientific community often uses 5% as a standard
level at which data are accepted as occurring other than by
chance. This means that, if chance alone were operating, a
result at least as extreme as the one observed would be expected
less than 5% of the time.
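
As an illustration of how such a probability is computed in practice, the sketch below turns a relative-risk estimate and the standard error of its logarithm into a two-sided p-value using a normal approximation. The approach is a common textbook one chosen here for illustration, not a method stated in this glossary, and the numbers (a relative risk of 1.15 with a standard error of 0.10) are hypothetical.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def p_value_for_relative_risk(rr, se_log_rr):
    """p-value for the null hypothesis RR = 1, treating ln(RR) as
    approximately normal with the given standard error (an assumption)."""
    z = math.log(rr) / se_log_rr
    return two_sided_p_from_z(z)


# Hypothetical inputs, for illustration only.
p = p_value_for_relative_risk(1.15, 0.10)
significant_at_5pct = p < 0.05

print(round(p, 3), significant_at_5pct)
```

With these inputs the p-value lies well above 0.05, so the hypothetical result would not be called statistically significant at the conventional level.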

Toxicology: The scientific study of poisons, their actions, their
detection and the treatment of the conditions produced by them.

Weight of evidence: A framework utilized by the EPA for judging
the likelihood that an agent is a human carcinogen. Three
major steps are involved: (1) characterization of evidence
from human studies and from animal studies, individually; (2)
combination of the characterizations of these two types of
data into an indication of the overall weight of evidence;
and (3) evaluation of all supporting information to determine
if the overall weight of evidence should be modified. [See
also definition for carcinogen classification system.]


DEFINITIONS


In medical research, there are two major types of studies: 
experimental studies and observational studies. 

An Experimental Study requires that the members of a 
study population be assigned to either a treatment or control 
group. The treated and untreated groups are then followed 
prospectively to see whether the two groups subsequently differ in 
their disease experience. 

Def: An Observational Study is one in which the 
treatment or exposure of interest is not assigned but instead occurs 
by choice or by happenstance. 

The types of observational studies are the case report, 
the cross-sectional study, the ecologic study, the case-control 
study, and the cohort study (often called prospective). 

A Case Report is strictly speaking not a scientific 
study but a description of a small number of persons with an unusual 
disease or an unusual change in their disease status. 

A Cross-sectional Study reports the characteristics
of a group of people at one point in time or a snapshot of their
health picture.

The Ecologic Study uses data that are routinely
collected (such as air pollution data) to study the occurrence of
disease among groups of people. For example, heart disease
incidence may be studied in a group of people where information on
national dietary habits is known.

A Case-control Study or Retrospective Study is one 
that begins with study subjects who have the disease of interest 
and a comparison group without the disease. The previous exposures 
of both groups are investigated. 

A Cohort Study or Prospective Study is one in which 
the researcher starts with one group of persons exposed to a factor 
of interest and another comparable group that is unexposed. These 
groups are observed at a later time to see whether they have 
developed differences which might be attributable to their 
different exposures. 

A Confounder is a factor which confuses the correct
interpretation of the data relating to a suspect exposure and disease. The
confounding factor acts by being associated both with the exposure
and the disease in a way that makes the exposure and the disease
seem to be related. An example, which was published in 1978,
related jet plane noise with an increased death rate. Upon
reexamination, it was found that persons exposed to jet noise lived
in devalued housing close to airports and were of a less fortunate
socioeconomic stratum. When a proper analysis of these other factors
was performed, jet noise was found to have no association with
increased mortality.

Confounding is the process by which a noncausal
association between two factors is produced by their association
with a third factor known as the confounder.

Bias is nonrandom error. The term is not related to the word bias
used in the sense of prejudice.

Risk (or absolute risk) is expressed as a death rate or 
disease rate. 

Relative Risk is the ratio or quotient of two risks or 
absolute risks. It is also known as a risk ratio. 

Odds Ratio is a measure of risk usually obtained from 
case-control studies and mathematically close to relative risk. 
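
To make the relationship between the two measures concrete, the sketch below computes a relative risk and an odds ratio from the same 2x2 table. The counts are hypothetical and chosen so that the disease is rare, which is the situation in which the odds ratio comes out close to the relative risk.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the disease risk among the exposed to that among the unexposed."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the odds of disease among the exposed to that among the unexposed."""
    exposed_noncases = exposed_total - exposed_cases
    unexposed_noncases = unexposed_total - unexposed_cases
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical counts: 20 cases per 10,000 exposed, 10 per 10,000 unexposed.
rr = relative_risk(20, 10_000, 10, 10_000)
or_ = odds_ratio(20, 10_000, 10, 10_000)

print(round(rr, 3), round(or_, 3))
```

When the disease is common the two measures diverge, and the odds ratio lies further from 1 than the relative risk.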

p-value is the probability, computed on the assumption
that chance alone is at work, of obtaining a finding at least as
extreme as the one observed. By convention, a finding with a
p-value less than 5%, or sometimes 1%, is called statistically
significant.

Statistical Association. Two factors are statistically
associated when there is a tendency for the factors to occur
together or to change together. The observed relationship is the
statistical association, measured in many ways.

General Definition of Bias:

Deviation of results or inferences from the truth, or processes leading to such
deviation. Any trend in the collection, analysis, interpretation, publication or
review of data that can lead to conclusions that are systematically different from
the truth.¹


Definition of Specific Biases:


Publication Biases:

Reviewer Bias - Systematic error due to failure of journal editors to accept and publish
reports with negative, non-significant or contrary conclusions.

File-drawer Bias - Systematic error due to failure of authors to submit reports on
negative, non-significant, or contrary conclusions.

Researcher Bias - Systematic error due to failure of authors to include negative,
non-significant, or contrary conclusions in reports documenting multiple-endpoint studies.

Subject Biases:

Recall Bias - Systematic error due to differences between case and control subjects in
accuracy or completeness of recall of prior events or experiences that may be related to
the medical endpoint of concern.

Reporting Bias - Systematic error due to selective suppression (or revealing) by the
subject of information such as past history of other disease that is related to the medical
endpoint of concern.

Misclassification Bias - Systematic error due to inclusion of subjects in case or control
groups who do not meet exposure criteria.

Medical Biases:

Detection Bias - Systematic error due to differing methods of ascertainment, diagnosis
or verification of cases between exposure groups.

Autopsy Bias - Systematic error resulting from the fact that autopsies represent a
nonrandom sample of deaths.


¹ A Dictionary of Epidemiology, Second Edition. Ed: Last JM. Oxford University Press, New York, 1986.

Statistical Biases:


Design Bias - Systematic error due to faulty design of a study, including uncontrolled
confounding, poorly defined populations, and nonsimultaneous comparisons using
historical controls.

Sampling Bias - Systematic error due to nonrandom inclusion of subjects from the
reference population because of availability of subjects, willingness of subjects to
participate, criteria for selection, use of hospital cases and/or controls, and subsequent
follow-up failure, withdrawal or exclusion from the study.

Foreword


The International Epidemiological Association is extremely
pleased that the Dictionary of Epidemiology has been so successful
that a second edition has been demanded. As one of the
Association's aims is to “spread the message,” this work is an
example of “what we call it.” Only if we all understand the same
thing when a particular term is used will the aim of the
Association be capable of being fulfilled. This dictionary is
fundamental to this objective.

W. W. Holland, MD FRCGP FRCP FFCM
President, International Epidemiological Association

This dictionary, appearing now in its second edition, is an
attempt to bring some order to the occasionally chaotic
nomenclature of epidemiology. It is intended for all who are
interested in epidemiology, especially those who are beginning to
study the subject, those whose first language is not English, and
those from other fields who need to know the terms
epidemiologists use.

Like all rapidly expanding sciences, epidemiology has been
confounded by the proliferation of words and phrases to
describe its concepts, principles, methods, and procedures. The
creation of new terms and disagreement about the meaning of
old ones can confuse beginners and established epidemiologists
alike.

Remarks by users of the first edition have reinforced the view
that the boundaries should be wide rather than narrow, that
the language should be simple, that some terms many
epidemiologists think everyone already knows should be included.
The second edition is larger than the first, partly for this
reason, and because terms omitted from the first edition have been
included and many old entries expanded.

The dictionary is not an index of permitted and proscribed
usage. I hope that it is authoritative without being
authoritarian. Where synonyms exist, the definition appears under the
most commonly used of these, but preference for one term over
another is not necessarily implied. In a few instances, the use
of a term is deprecated. Some terms that are properly
described as slang or jargon have been included because they are
widely used and their meaning is not always clear from the
context. Murphy’s description of jargon is worth recalling:
"obscure and/or pretentious language, circumlocutions, invented
meanings, and pomposity delighted in for its own sake."

There was disagreement among the contributors to this
edition about including certain acronyms and eponyms. An
acronym is a word made up of letters from two or more other words,
e.g. ANOVA for analysis of variance, or from initial letters, e.g.
WHO for World Health Organization. All lay and technical
vocabularies contain acronyms; epidemiology has its fair share.
By convention, acronyms are spelt out the first time they
appear in a text, and, if they are numerous, considerate editors
sometimes supply a glossary, or at least list the acronyms along
with the words for which they stand in an index. Although this
dictionary is not the place for extensive mention of acronyms,
a few appeared in the first edition, and a few more appear
here.

Eponyms, the attachment of personal or place names to
concepts, diseases, methods or specific studies, also occur often
enough in published papers and books for us to recognize that
beginners need some guidance to the meaning of those most
widely used. Some appeared in the first edition, and a few have
been added to the second, though again this dictionary is not
the proper place for a full glossary of epidemiological eponyms
(where would such a glossary end!).

As was the case with the first edition, a large number of
epidemiologists from many countries have participated in this
revision. The original modest notices in a couple of journals and
a few casual remarks among friends produced a mailing list of
some forty persons, mainly in North America and the United
Kingdom. The mailing list rapidly grew until, by the fifth round
of correspondence in December 1986, there were 108
correspondents in 25 countries. The list continued to grow after this
fifth and final round; but the published roster of names that
follows this preface is both more and less than the number of
active participants. Some seemingly inquired just from curiosity
and played no further part. Others wrote lengthy and often
vigorously argumentative comments and suggestions
expressing not only their own views but those of colleagues in their
academic department or institution; in one instance,
colleagues elsewhere in that nation.

In addition to extensive comments from these
correspondents, I have made good use of other technical dictionaries and
glossaries in compiling this revision. All of these are listed in
the bibliography, and many are also to be found in footnotes
that follow specific entries.

The compilers of dictionaries must exercise the greatest care 
in the choice of words and in their arrangement. Most entries 
in this dictionary have been repeatedly discussed with many 
contributors, and in nearly all instances the wording has been 
agreed upon by all; on the rare occasions when agreement eluded 
us, the final decision was mine alone. Therefore, I accept full 
responsibility for the deficiencies in the finished product. 

The work has been sponsored by the International
Epidemiological Association, which provided partial travel support
for me to attend two meetings in 1986; further support was
provided by the McLean Foundation and the Milbank
Memorial Fund. All royalties from the sale of this edition, like those
from the first edition, will go to the International
Epidemiological Association.

Finally, I thank Jeffrey House of Oxford University Press for
helpful advice and encouragement.

Ottawa, Canada    J. L.
November 1987

abortion rate The estimated annual number of abortions per 1000 women of
reproductive age (usually defined as age 15-44).

abortion ratio The estimated number of abortions per 100 live births in a given year. 
abscissa The distance along the horizontal coordinate, or x axis, of a point P from the
vertical or y axis of a graph. See also axis, graph, ordinate.
absolute risk Usually this term means the observed or calculated risk of an event in
a population under study, as contrasted with the relative risk. Sometimes, however,
it is a synonym for attributable fraction, excess risk, or risk difference; because of
the inconsistency, this term should be avoided. See also risk.
acceptable risk The risk that has minimal detrimental effects, or for which the
benefits outweigh the potential hazards. Epidemiologic study has provided data for
calculation of risks associated with many medical procedures and also with
occupational and environmental exposures; these data are used, for instance, in clinical
DECISION ANALYSIS.

accuracy The degree to which a measurement, or an estimate based on
measurements, represents the true value of the attribute that is being measured. See also
MEASUREMENT, PROBLEMS WITH TERMINOLOGY.
acquaintance network Group of persons in contact or communication among whom
transmission of an infectious agent and of knowledge, attitudes, and values is
possible, and whose social interaction may have health implications. See also
transmission of infection.

acquired immunodeficiency syndrome (Syn: acquired immune deficiency syndrome)
(AIDS) For surveillance purposes, the Centers for Disease Control, Atlanta,
Georgia,¹ define a case of AIDS as an illness characterized by (1) one or more of a group
of opportunistic or indicator diseases that are indicative of underlying cellular
immunodeficiency; (2) absence of all known underlying causes of cellular
immunodeficiency and absence of all other causes of reduced resistance to opportunistic or
indicator diseases. Additional criteria are serum positive for HIV antibody, positive
culture for HIV, and reduction of T4 "helper" lymphocytes.

The opportunistic or indicator diseases associated with AIDS include certain
protozoal and helminth infections, notably Pneumocystis carinii pneumonia and
toxoplasmosis; fungal infections, notably candidiasis of esophagus, trachea, bronchi or
lungs and cryptococcosis, especially affecting the central nervous system; bacterial
infections, notably with certain mycobacteria; viral infections, notably
cytomegalovirus and herpes simplex; and cancer, notably Kaposi's sarcoma and lymphoma
limited to the brain.

AIDS-related complex (ARC) is the combination of HIV positive test with
lymphadenopathy and persistent low fever but without immunodeficiency or
opportunistic diseases.

1 1987 Revision of case definition of AIDS for surveillance purposes. MMWR 36, 15:45-9$. 1987. 

ACTIVITIES OF DAILY LIVING (ADL) SCALE A scale devised by Katz and others¹ to score
physical ability/disability; used to measure outcomes of interventions for various
chronic disabling conditions such as arthritis. The scale is based on scores for
responses to questions about mobility, self-care, grooming, etc. This was the first widely
used scale of this type; others, mostly refinements or variations of the ADL scale,
have since been developed.

¹ Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW: Studies of illness in the aged. The
index of ADL: a standardized measure of biological function. JAMA 185:914-919, 1963.

ACTUARIAL RATE See FORCE OF MORTALITY.

ACTUARIAL TABLE See LIFE TABLE. 

ACUTE 

1. Referring to a health effect, brief; sometimes loosely used to mean severe. 

2. Referring to exposure, brief, intense, or short-term; sometimes specifically
referring to brief exposure of high intensity. See also chronic.

adaptation A heritable component of the phenotype which confers an advantage in
survival and reproductive success. The process by which organisms adapt to
environmental conditions.

additive model A model in which the combined effect of several factors is the sum of
the effects that would be produced by each of the factors in the absence of the
others. For example, if factor X adds x% to risk in the absence of Y, and if factor Y
adds y% to risk in the absence of X, an additive model states that the two factors
together will add (x + y)% to risk. See also interaction; linear model; mathematical
model; multiplicative model.
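
A minimal sketch of the additive model in this entry, with a multiplicative combination shown only for contrast. The multiplicative formula and the example percentages (10% and 20%) are illustrative assumptions and are not taken from the dictionary.

```python
def additive_excess(x_pct, y_pct):
    """Excess risk (in %) when two factors act together under an additive model."""
    return x_pct + y_pct


def multiplicative_excess(x_pct, y_pct):
    """Excess risk (in %) if the two relative risks multiply instead
    (shown for contrast; an assumption, not part of this entry)."""
    combined_rr = (1 + x_pct / 100) * (1 + y_pct / 100)
    return (combined_rr - 1) * 100


# Hypothetical factors: one adds 10% to risk alone, the other adds 20% alone.
together_additive = additive_excess(10, 20)
together_multiplicative = multiplicative_excess(10, 20)

print(together_additive, round(together_multiplicative, 1))
```

For small excess risks the two models give nearly the same answer; the gap widens as the individual effects grow.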

adjustment A summarizing procedure for a statistical measure in which the effects of
differences in composition of the populations being compared have been
minimized by statistical methods. Examples are adjustment by regression analysis and
by standardization. Adjustment often is performed on rates or relative risks,
commonly because of differing age distributions in populations that are being
compared. The mathematical procedure commonly used to adjust rates for age
differences is direct or indirect standardization.

adverse reaction, side effect Any undesirable or unwanted consequence of a
preventive, diagnostic, or therapeutic procedure.

aetiology, aetiologic See etiology, etiologic.

AGE DEPENDENCY RATIO See DEPENDENCY RATIO.
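
As a sketch of what adjustment by direct standardization involves, the example below applies each population's age-specific rates to one shared standard population, so that the resulting summary rates no longer reflect differences in age structure. All rates, age bands, and population sizes are hypothetical.

```python
def directly_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates, with weights taken from
    the age distribution of a chosen standard population."""
    total = sum(standard_population.values())
    expected_events = sum(
        age_specific_rates[age] * standard_population[age]
        for age in standard_population
    )
    return expected_events / total


# Hypothetical standard population and age-specific death rates.
standard = {"under 65": 6_000, "65 and over": 4_000}
rates_a = {"under 65": 0.001, "65 and over": 0.010}
rates_b = {"under 65": 0.002, "65 and over": 0.008}

adjusted_a = directly_standardized_rate(rates_a, standard)
adjusted_b = directly_standardized_rate(rates_b, standard)

print(round(adjusted_a, 4), round(adjusted_b, 4))
```

The choice of standard population affects the adjusted values, so comparisons are only meaningful between rates adjusted to the same standard.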

agent (of disease) A factor, such as a microorganism, chemical substance, or form of 
radiation, whose presence, excessive presence, or (in deficiency diseases) relative 
absence is essential for the occurrence of a disease. A disease may have a single 
agent, a number of independent alternative agents (at least one of which must be
present), or a complex of two or more factors whose combined presence is essential 
for the development of the disease. See also causality; necessary and sufficient 
cause. 

AGE-PERIOD COHORT ANALYSIS See COHORT ANALYSIS. 

AGE-SEX PYRAMID See POPULATION PYRAMID. 

age-sex register List of all clients or patients of a medical practice or service,
classified by age (birthdate) and sex; provides denominator for calculating age- and
sex-specific rates.

age-specific fertility rate The number of births occurring during a specified
period to women of a specified age group, divided by the number of person-years
lived during that period by women of that age group. When an age-specific fertility
rate is calculated for a calendar year, the number of births to women of the
specified age is usually divided by the midyear population of women of that age.
age-specific rate A rate for a specified age group. The numerator and denominator
refer to the same age group.

Example:

\[
\text{Age-specific death rate (age 25--34)} =
\frac{\text{Number of deaths among residents age 25--34 in an area in a year}}
     {\text{Average (or midyear) population age 25--34 in the area in that year}}
\times 100{,}000
\]

The multiplier (usually 100,000 or 1,000,000) is chosen to produce a rate that can 
be expressed as a convenient number. 
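
The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name and the figures (84 deaths, 70,000 residents) are hypothetical and do not come from the dictionary.

```python
def age_specific_rate(events, population, multiplier=100_000):
    """Events in one age group divided by that group's average (or midyear)
    population, scaled by a convenience multiplier."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * multiplier


# Hypothetical figures: 84 deaths in a year among 70,000 residents aged 25-34.
rate_25_34 = age_specific_rate(84, 70_000)
print(f"{rate_25_34:.1f} deaths per 100,000")
```

Numerator and denominator must refer to the same age group, area, and period; the multiplier only changes the scale on which the rate is reported.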

age standardization A procedure for adjusting rates, e.g., death rates, designed to
minimize the effects of differences in age composition when comparing rates for
different populations. See also adjustment; standardization.

aggregation bias (Syn: ecological bias) See ecological fallacy.

aging of the population A demographic term, meaning an increase over time in the
proportion of older persons in the population. It does not necessarily imply an
increase in life expectancy or that "people are living longer than they used to." The
principal determinant of aging in the population has been a decline in the birth
rate: when fewer children are born than in prior years, the result, in the absence
of a rise in the death rate at higher ages, has been an increase in the proportion of
older persons in the population. In developed societies, however, mortality change
is becoming a factor: little further mortality reduction can occur in the first half of
life, so reductions are beginning to occur in the third and fourth quarters of life,
leading to a rise in the proportion of older persons from this cause.

airborne infection A mechanism of transmission of an infectious agent by particles,
dust, or droplet nuclei suspended in the air. See also transmission of infection.

algorithm Any systematic process that consists of an ordered sequence of steps with 
each step depending on the outcome of the previous one. The term is commonly 
used to describe a structured process, for instance, relating to computer program¬ 
ming or to health planning. See also decision tree. 

algorithm, clinical (Syn: clinical protocol) An explicit description of steps to be taken
in patient care in specified circumstances. This approach makes use of branching
logic and of all pertinent data, both about the patient and from epidemiologic and
other sources, to arrive at decisions that yield maximum benefit and minimum risk.

ALLELE Alternative forms of a gene, occupying the same locus on a chromosome.

ALPHA ERROR See ERROR, TYPE I.

ALPHA LEVEL See SIGNIFICANCE LEVEL.

analysis of variance A statistical technique that isolates and assesses the contribution 
of categorical independent variables to variation in the mean of a continuous de¬ 
pendent variable. The observations are classified according to their categories for 
each of the independent variables, and the differences between the categories in 
their mean values on the dependent variable are estimated and tested for statistical 
significance. 

analytic study A study designed to examine associations, commonly putative or hypothesized
causal relationships. An analytic study is usually concerned with identifying
or measuring the effects of risk factors, or is concerned with the health effects
of specific exposure(s). Contrast descriptive study, which does not test hypotheses. 
The common types of analytic study are cross-sectional, cohort, and case-con¬ 
trol. In an analytic study, individuals in the study population may be classified 
according to absence or presence (or future development) of specific disease and 
according to "attributes" that may influence disease occurrence. Attributes may include
age, race, sex, other disease(s), genetic, biochemical, and physiological characteristics,
economic status, occupation, residence, and various aspects of the environment
or personal behavior. See also case control study; cohort study; cross-sectional
study; study design.

ANIMAL MODEL Study in a population of laboratory animals that uses conditions of an¬ 
imals analogous to conditions of humans to model processes comparable to those 
that occur in human populations. See also experimental epidemiology. 

antagonism Opposite of synergism. The situation in which the combined effect of two 
or more factors is smaller than the solitary effect of any one of the factors. In
bioassay, the term may be used to refer to the situation when a specified response 
is produced by exposure to either of two factors but not by exposure to both to¬ 
gether. 

anthropometry The technique that deals with the measurement of the size, weight, 
and proportions of the human body. 

anthropophilic (adj.) Pertaining to an insect's preference for feeding on humans even
when nonhuman hosts are available. 

antibody Protein molecule formed by exposure to a "foreign" or extraneous substance,
e.g., invading microorganisms responsible for infection, or active immunization. May
also be present as a result of passive transfer from mother to infant, via immune
globulin, etc. Antibody has the capacity to bind specifically to the foreign substance
(antigen) that elicited its production, thus supplying a mechanism for protection
against infectious diseases. Antibody is epidemiologically important because its concentration
(titer) can be measured in individuals, and, therefore, in populations.
See also seroepidemiology.

antigen A substance (protein, polysaccharide, glycolipid, tissue transplant, etc.) that is 
capable of inducing specific immune response. Introduction of antigen may be by 
the invasion of infectious organisms, immunization, inhalation, ingestion, etc. 

antigenic drift This term describes the “evolutionary” changes that take place in the 
molecular structure of DNA/RNA in micro-organisms during their passage from 
one host to another. It may be due to recombination, deletion or insertion of genes, 
to point mutations, or to several of these events. This process has been studied in 
common viruses, notably the influenza virus.1 It leads to alteration (usually slow
and progressive) in the antigenic composition, and thus in the immunologic re¬ 
sponses of individuals and populations to exposure to the micro-organisms con¬ 
cerned. See also antigenic shift. 

1 Palese P, Young JF: Variation of influenza A, B, and C viruses. Science 215:1468-1473, 1982.

antigenic shift This term describes mutation, i.e., a sudden change in molecular 
structure of DNA/RNA in micro-organisms, especially viruses, which produces new 
strains of the micro-organism. Hosts previously exposed to other strains have little 
or no acquired immunity. Antigenic shift is believed to be the explanation for the 
occurrence of strains of the influenza A virus associated with large-scale epidemic 
and pandemic spread. Antigenic shift is responsible for the susceptibility of host 
populations to a new strain of influenza virus. See also antigenic drift. 

antigenicity (Syn: immunogenicity) The ability of agent(s) to produce a systemic or a
local immunologic reaction in the host.

arbovirus A group of taxonomically diverse animal viruses that are unified by an epidemiologic
concept, i.e., transmission between vertebrate host organisms by blood-feeding
(hematophagous) arthropod vectors such as mosquitoes, ticks, sand flies,
and midges. The term is a contraction of arthropod-borne virus.

The interaction of arbovirus, vertebrate host(s), and arthropod vector gives this 
class of infections several unique epidemiologic features. See vector-borne infec¬ 
tion for definition of terms used to describe these features. 

AREA SAMPLING A method of sampling that can be used when the numbers in the population
are unknown. The total area to be sampled is divided into subareas, e.g., by
means of a grid that produces squares on a map; these subareas are then numbered 
and sampled, using a table of random numbers. Depending upon circumstances, 
the population in the sampled areas may first be enumerated, then a second stage 
of sampling may be conducted. 
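
The first stage described above (grid, numbering, random draw) might be sketched as follows. This is an illustrative sketch only: it substitutes Python's pseudorandom generator for a printed table of random numbers, and the function name, grid size, and seed are assumptions made for the example.

```python
import random


def sample_subareas(n_rows, n_cols, n_sampled, seed=None):
    """Number the grid squares 1..n_rows*n_cols and draw a simple random
    sample of them without replacement."""
    rng = random.Random(seed)  # stands in for a table of random numbers
    numbered = list(range(1, n_rows * n_cols + 1))
    chosen = sorted(rng.sample(numbered, n_sampled))
    # Convert each square number back to (row, column) to locate it on the map.
    return [((k - 1) // n_cols, (k - 1) % n_cols) for k in chosen]


# Hypothetical map divided into a 10 x 12 grid, with 8 squares sampled.
cells = sample_subareas(n_rows=10, n_cols=12, n_sampled=8, seed=1)
print(cells)
```

The population within each selected square could then be enumerated and, if needed, sampled again in a second stage.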

arithmetic mean The sum of all the values in a set of measurements, divided by the
number of values in the set. 

artificial intelligence A branch of computer science in which attempts are made to
duplicate human intellectual functions. One application is in diagnosis, in which
computer programs are often based upon epidemiologic analyses of data in hospital 
charts or other clinical records. 

ascertainment The process of determining what is happening in a population or study 
group, e.g., family and household composition, occurrence of cases of specific diseases;
the latter is also known as case-finding.

ascertainment bias Systematic failure to represent equally all classes of cases or persons
supposed to be represented in a sample. This bias may arise because of the
nature of the sources from which persons come, e.g., a specialized clinic; from a
diagnostic process influenced by culture, custom, or idiosyncrasy; or, for example,
in genetic studies, from the statistical chance of selecting from large or small families.

assay The quantitative or qualitative evaluation of a hazardous substance; the results 
of such an evaluation. 

association (Syn: correlation, [statistical] dependence, relationship) Statistical depen¬ 
dence between two or more events, characteristics, or other variables. An associa¬ 
tion is present if the probability of occurrence of an event or characteristic, or the 
quantity of a variable, depends upon the occurrence of one or more other events, 
the presence of one or more other characteristics, or the quantity of one or more
other variables. The association between two variables is described as positive when 
the occurrence of higher values of a variable is associated with the occurrence of 
higher values of another variable. In a negative association, the occurrence of higher 
values of one variable is associated with lower values of the other variable. An as¬ 
sociation may be fortuitous or may be produced by various other circumstances; 
the presence of an association does not necessarily imply a causal relationship. If 
the use of the term “association" is confined to situations in which the relationship 
between two variables is statistically significant, the terms “statistical association" and 
“statistically significant association" become tautological. However, ordinary usage 
is seldom so precise as this. The terms "association" and "relationship" are often 
used interchangeably. 

Associations can be broadly grouped under two headings, symmetrical or non- 
causal (see below) and asymmetrical or causal. 

association, asymmetrical (Syn: asymmetrical relationship) The definitive conditions 
of asymmetrical associations are direction and time. Independent variable X must 
cause changes in dependent variable Y, and the "causal" variable must precede its
"effects." Bradford Hill 1 and others 2,3 have pointed out that the (subjective) likelihood
of a causal relationship is increased by the presence of the following attributes.
However, temporality is the only indispensable condition among these.

1. Consistency—The association is consistent if the results are replicated when 
studied in different settings and by different methods. 

2. Strength—This is an expression of the disparity between the frequency with 
which a factor is found in the disease and the frequency with which it occurs 
in the absence of the disease. Not to be confused with statistical significance. 

3. Specificity—This is established with the limitation of the association to a single 
putative cause and single effect. 

4. Dose-response relationship—This is established when an increased risk or severity
in disease occurs with an increased quantity ("dose") or duration of exposure
to a factor.

5. Temporality—The exposure to a putative cause always precedes, never follows,
the outcome.

6. Biological plausibility—It is desirable that the association agree with current 
understanding of the response of cells, tissues, organs, and systems to stimuli. 
This criterion should not be applied rigidly. The association may be new to 
science or medicine. As Sherlock Holmes advised Dr. Watson, “When you have 
eliminated the impossible, whatever remains, however improbable, must be 
the truth." 

7. Coherence—The associations should not conflict with the generally known facts 
of the natural history and biology of disease. 

8. Experiment—It is sometimes possible to appeal to experimental, or quasi-experimental
evidence, e.g., an observed association leads to some preventive
action. Does this action in fact prevent?

See also causality; Evans's postulates; Koch's postulates.

1 Bradford Hill A: The environment and disease: Association or causation. Proc Roy Soc Med 58:295-300, 1965.

2 Susser MW: Judgment and causal inference. Am J Epidemiol 105:1-15, 1977.

3 Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.

association, direct Directly associated, i.e., not via a known third variable: A→B. Refers
only to causality.

association, indirect causal Two types are distinguished: 

1. Association of a factor C with disease A only because both are related to a
common underlying factor B.

Alteration of factor C will not produce an alteration in the frequency of disease
A unless an alteration in C affects B. It has been suggested that to avoid
confusion with the alternative meaning of indirect association, this type should
be called "secondary association."

2. Association of a factor C with disease A by means of an intermediate or intervening
factor B.

C→B→A

Alteration of factor C would produce an alteration in the frequency of disease
A. To avoid confusion, this type should be called "indirect causal association."





association, spurious A term, preferably avoided, used with different meanings by 
different authors. It may refer to artifactual, fortuitous, false secondary, or to all 
kinds of noncausal associations due to chance, bias, failure to control for extraneous
variables, etc. 

association, symmetrical An association is noncausal if it is symmetrical, as in the
statement F = MA (force equals mass times acceleration). This is a noncausal, nondirectional
expression of the mathematical relationship between the physical properties
of force, mass, and acceleration. If one side of the equation is changed, then the
other must also change to maintain equilibrium.

Although epidemiologists are usually most interested in asymmetrical statements 
that have direction, the symmetrical equation can be useful. For instance, prevalence
can be expressed in terms of incidence and duration in the simple equation,
P = I × D. If two of these three elements are known, the third can be derived. See
also SYMMETRICAL RELATIONSHIP.
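
As a minimal sketch of how the third element of P = I × D can be derived from the other two, the hypothetical helper below solves for whichever quantity is omitted. The function name and the figures are illustrative assumptions, and the relation itself is usually treated as an approximation for a stable, low-prevalence condition.

```python
def solve_prevalence_relation(prevalence=None, incidence=None, duration=None):
    """Given exactly two of P, I, and D (with P = I * D), return (P, I, D)."""
    missing = [value is None for value in (prevalence, incidence, duration)]
    if sum(missing) != 1:
        raise ValueError("supply exactly two of the three quantities")
    if prevalence is None:
        prevalence = incidence * duration
    elif incidence is None:
        incidence = prevalence / duration
    else:
        duration = prevalence / incidence
    return prevalence, incidence, duration


# Hypothetical figures: incidence 0.002 per person-year, mean duration 5 years.
p, i, d = solve_prevalence_relation(incidence=0.002, duration=5.0)
print(p)
```

Because the equation is symmetrical, the same function serves whichever quantity is unknown; nothing in it implies a direction of effect.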

assortative mating Selection of a mate with preference (or aversion) for a particular
genotype, i.e., nonrandom mating. 

ASYMMETRICAL ASSOCIATION See ASSOCIATION, ASYMMETRICAL. 

asymptotic Pertaining to a limiting value, for example, of a dependent variable, when 
the independent variable approaches zero or infinity. See large sample method. 

ASYMPTOTIC METHOD See LARGE SAMPLE METHOD. 

attack rate Attack rate, or case rate, is a cumulative incidence rate often used for
particular groups, observed for limited periods and under special circumstances, as 
in an epidemic. 

The secondary attack rate is the number of cases among contacts occurring within 
the accepted incubation period following exposure to a primary case, in relation to 
the total of exposed contacts; the denominator may be restricted to susceptible con¬ 
tacts when determinable. 

Infection rate is the incidence of manifest plus inapparent infections, which can be 
identified, e.g., by seroepidemiology.
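
The secondary attack rate described above can be illustrated with a short calculation. This is a sketch with made-up numbers; the function name and the optional `immune_contacts` parameter (used to restrict the denominator to susceptible contacts) are assumptions for the example.

```python
def secondary_attack_rate(secondary_cases, exposed_contacts, immune_contacts=0):
    """Cases among contacts within the accepted incubation period, divided by
    exposed contacts (optionally only the susceptible ones)."""
    denominator = exposed_contacts - immune_contacts
    if denominator <= 0:
        raise ValueError("no contacts left in the denominator")
    return secondary_cases / denominator


# Hypothetical outbreak: 6 secondary cases among 40 exposed household contacts,
# of whom 10 are known to be immune.
sar_all = secondary_attack_rate(6, 40)
sar_susceptible = secondary_attack_rate(6, 40, immune_contacts=10)
print(sar_all, sar_susceptible)
```

Restricting the denominator to susceptible contacts raises the rate, so the choice of denominator should be stated whenever the figure is reported.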

attributable fraction (AF) (Syn: attributable proportion) A term sometimes used to
refer to the attributable fraction in the population, and sometimes to the attributable
fraction among the exposed. See also attributable fraction (exposed); attributable
fraction (population).

attributable fraction (exposed) (Syn: attributable proportion (exposed), attributable
risk, etiologic fraction (exposed)). With a given outcome, exposure factor and
population, the attributable fraction among the exposed is the proportion by which
the incidence rate of the outcome among those exposed would be reduced if the
exposure were eliminated. It may be estimated by the formula

\[
AF_e = \frac{I_e - I_u}{I_e}
\]

where $I_e$ is the incidence rate among the exposed, $I_u$ is the incidence rate among
the unexposed; or by the formula

\[
AF_e = \frac{RR - 1}{RR}
\]

where RR is the rate ratio, $I_e/I_u$. It is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups.
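
The two ways of estimating the attributable fraction among the exposed, from the two incidence rates or from the rate ratio, should agree. The sketch below checks this on hypothetical rates; the function names and figures are illustrative assumptions only.

```python
def attributable_fraction_exposed(rate_exposed, rate_unexposed):
    """(I_e - I_u) / I_e: proportion of the rate in the exposed that would be
    removed if the exposure were eliminated."""
    if rate_exposed <= 0:
        raise ValueError("rate among the exposed must be positive")
    return (rate_exposed - rate_unexposed) / rate_exposed


def attributable_fraction_exposed_from_rr(rate_ratio):
    """(RR - 1) / RR, where RR = I_e / I_u."""
    if rate_ratio <= 0:
        raise ValueError("rate ratio must be positive")
    return (rate_ratio - 1.0) / rate_ratio


# Hypothetical rates: 30 vs 10 cases per 1,000 person-years.
i_e, i_u = 30.0, 10.0
af_direct = attributable_fraction_exposed(i_e, i_u)
af_from_rr = attributable_fraction_exposed_from_rr(i_e / i_u)
print(round(af_direct, 4), round(af_from_rr, 4))
```

Both routes give two thirds here, which is only meaningful under the stated assumption that other causes act equally on the exposed and unexposed groups.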

attributable fraction (population) (Syn: attributable proportion (population), etiologic
fraction (population), attributable risk). With a given outcome, exposure factor,
and population, the attributable fraction among the population is the proportion
by which the incidence rate of the outcome in the entire population would be
reduced if exposure were eliminated. It may be estimated by the formula

\[
AF_p = \frac{I_p - I_u}{I_p}
\]

where $I_p$ is the incidence rate in the total population and $I_u$ is the incidence rate
among the unexposed; or by the formula

\[
AF_p = \frac{P_e(RR - 1)}{1 + P_e(RR - 1)}
\]

where $P_e$ is the proportion of the population exposed and RR is the rate ratio,
$I_e/I_u$. It is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups.

attributable number The number of new occurrences of a specific outcome attributable
to an exposure; it may be estimated using the formula

\[
AN = N_e (I_e - I_u)
\]

where $I_e$ is the incidence rate among the exposed, $I_u$ is the incidence rate among
the unexposed, and $N_e$ is the number of persons in the exposed population. It is
assumed that causes other than the one under investigation have had equal effects
on the exposed and unexposed groups.
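
A short worked sketch of the population attributable fraction and the attributable number follows. All names and figures are hypothetical; the proportion exposed (`p_e`) and the construction of the total-population rate as a weighted average of the exposed and unexposed rates are assumptions made so that the two population formulas can be compared.

```python
def attributable_fraction_population(rate_population, rate_unexposed):
    """(I_p - I_u) / I_p."""
    if rate_population <= 0:
        raise ValueError("population rate must be positive")
    return (rate_population - rate_unexposed) / rate_population


def attributable_fraction_population_from_rr(proportion_exposed, rate_ratio):
    """P_e (RR - 1) / (1 + P_e (RR - 1))."""
    excess = proportion_exposed * (rate_ratio - 1.0)
    return excess / (1.0 + excess)


def attributable_number(n_exposed, rate_exposed, rate_unexposed):
    """N_e (I_e - I_u): new occurrences attributable to the exposure."""
    return n_exposed * (rate_exposed - rate_unexposed)


# Hypothetical inputs: rates per person-year, a quarter of the population exposed.
i_e, i_u, p_e = 0.030, 0.010, 0.25
i_p = p_e * i_e + (1.0 - p_e) * i_u  # rate in the total population

af_p = attributable_fraction_population(i_p, i_u)
af_p_rr = attributable_fraction_population_from_rr(p_e, i_e / i_u)
an = attributable_number(2000, i_e, i_u)  # per year among 2,000 exposed persons
print(round(af_p, 4), round(af_p_rr, 4), round(an, 1))
```

With these inputs the two population formulas agree at one third, and about 40 cases a year among the 2,000 exposed persons would be attributed to the exposure.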

attributable risk The rate of a disease or other outcome in exposed individuals that 
can be attributed to the exposure. This measure is derived by subtracting the rate 
of the outcome (usually incidence or mortality) among the unexposed from the rate 
among the exposed individuals; it is assumed that causes other than the one under 
investigation have had equal effects on the exposed and unexposed groups. Unfor¬ 
tunately, this term has been used to denote a number of different concepts, includ¬ 
ing the attributable fraction in the population, the attributable fraction among the 
exposed, the population excess rate, and the rate difference. Therefore, it should 
be defined carefully by all who use it. See also attributable fraction (exposed); 
population excess rate; attributable fraction (population); population at¬ 
tributable risk; rate difference. 

attributable risk (exposed) This term has been used with different connotations to
denote the attributable fraction among the exposed and the excess risk among the
exposed. See also attributable fraction (exposed); rate difference.

attributable risk (population) This term has been used with different connotations
to denote the attributable fraction in the population and the population excess risk.
See also attributable fraction (population); population excess rate.

attributable risk percent Attributable fraction expressed as a percentage rather
than as a proportion.

attributable risk percent (exposed) This is the attributable fraction among the exposed,
expressed as a percentage. See also attributable fraction (exposed).

attributable risk percent (population) This is the attributable fraction in the population,
expressed as a percentage. See also attributable fraction (population).

attribute A qualitative characteristic of an individual or item.

audit An examination or review that establishes the extent to which a condition, pro¬ 
cess, or performance conforms to predetermined standards or criteria. 

autopsy data Data derived from autopsied deaths, e.g., for study of natural history of
disease and trends in frequency of disease. Autopsies are done on nonrandomly
selected persons in the population and findings should therefore be generalized
only with great caution.

average Kendall and Buckland's Dictionary of Statistical Terms (4th Edition, 1982) has
this to say: "A familiar but elusive concept. Generally an 'average' value purports
to represent or to summarize the relevant features of a set of values; and in this
sense the term would include the median and the mode. In a more limited sense
an 'average' compounds all the values of the set, e.g., in the case of the arithmetic
or geometric means. In ordinary usage, 'the average' is often understood to refer
to the arithmetic mean." See also measures of central tendency.

AVERAGE LIFE EXPECTANCY See EXPECTATION OF LIFE. 

AXIS 

1. One of the dimensions of a graph. A two-dimensional graph has two axes, the
horizontal or x axis, and the vertical or y axis. Mathematically, there may be 
more than two axes, and graphs are sometimes drawn with a third dimension; 
the eye cannot comprehend more than three dimensions. 

2. In nosology, an axis of classification is the conceptual framework, e.g., etiologic,
topographic, psychologic, sociologic. The International Classification of
Disease, for example, is multiaxial; the primary axis is topographic (i.e., body
systems); secondary axes relate to etiology, manifestations of disease, detail of
sites affected, severity, etc.

background level, rate The concentration, often low, at which some substance, agent,
or event is present or occurs at a particular time and place in the absence of a
specific hazard or set of hazards under investigation. An example is the background 
level of naturally occurring forms of ionizing radiation to which we are all exposed. 

BAR diagram A graphic technique for presenting discrete data organized in such a 
way that each observation can fall into one and only one category of the variable. 
Frequencies are listed along one axis and categories of the variable along the other 
axis. The frequencies of each group of observations are represented by the lengths 
of the corresponding bars. See also histogram. 



[Figure: bar diagram. Categories shown: heart conditions; arthritis and rheumatism; visual impairments; hypertension without heart involvement; diabetes; impairments, lower extremities and hips.]

Bar diagram. From Susser, Watson, Hopper, 1985.

Bayes' theorem A theorem in probability theory named for Thomas Bayes (1702-1761),
an English clergyman and mathematician; his Essay Towards Solving a Problem
in the Doctrine of Chances (1763, published posthumously), contained this theorem.
In epidemiology, it is used to obtain the probability of disease in a group of people
with some characteristic on the basis of the overall rate of that disease (the prior
probability of disease) and of the likelihoods of that characteristic in healthy and
diseased individuals. The most familiar application is in clinical decision analysis
where it is used for estimating the probability of a particular diagnosis given the
appearance of some symptoms or test result. A simplified version of the theorem is

\[
P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S \mid D)\,P(D) + P(S \mid \bar{D})\,P(\bar{D})}
\]

where D = disease, S = symptom, and $\bar{D}$ = no disease. The formula emphasizes what
clinical intuition often overlooks, namely, that the probability of disease given this
symptom depends not only on how characteristic that symptom is of the disease but
also on how frequent the disease is among the population being served. "If you
hear hoof beats in the street, do not look for zebra."

The theorem can also be used for estimating exposure-specific rates from case
control studies if there is added information about the overall rate of disease in that 
population. 

Some of the terms in the theorem have special names. The probability of disease 
given the symptom is called the "posterior probability.” It is an estimate of the 
probability of disease posterior to knowing whether or not the symptom was present.
The overall probability of disease among the population or our guess of the
probability of disease before knowing of the presence or absence of the symptom 
is called the “prior probability.” The theorem is sometimes presented in terms of 
the odds of disease before knowing the symptom (prior odds) and after knowing 
the symptom (posterior odds). 
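
To make the dependence on the prior probability concrete, here is a minimal sketch of the simplified theorem in both its probability and odds forms. The function names and the input values (a 1% prior, symptom present in 90% of diseased and 5% of non-diseased persons) are hypothetical.

```python
def posterior_probability(prior, p_symptom_given_disease, p_symptom_given_no_disease):
    """P(D|S) from the prior P(D) and the two likelihoods of the symptom."""
    numerator = p_symptom_given_disease * prior
    denominator = numerator + p_symptom_given_no_disease * (1.0 - prior)
    if denominator == 0:
        raise ValueError("the symptom has zero probability under both states")
    return numerator / denominator


def to_odds(probability):
    """Convert a probability (below 1) to odds."""
    return probability / (1.0 - probability)


# Hypothetical values: prior 0.01, P(S|D) = 0.90, P(S|no D) = 0.05.
prior, p_s_d, p_s_nd = 0.01, 0.90, 0.05
posterior = posterior_probability(prior, p_s_d, p_s_nd)

# Odds form: posterior odds = prior odds x likelihood ratio.
posterior_odds = to_odds(prior) * (p_s_d / p_s_nd)
print(round(posterior, 4), round(posterior_odds, 4))
```

Even with a symptom that is highly characteristic of the disease, the posterior probability stays near 15% in this example because the disease is rare in the population served.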

behavioral epidemic An epidemic originating in behavioral patterns (as opposed to
invading microorganisms or physical agents). Examples include the dancing manias
of the Middle Ages, episodes of mass fainting or convulsions ("hysterical epidemics"),
crowd panic, or waves of fashion or enthusiasm. The communicable nature of
the behavior is dependent not only on person-to-person transmission of the behavioral
pattern but also on group reinforcement (as with smoking, alcohol, or drug
use). Behavioral epidemics may be difficult to differentiate from, or may complicate,
outbreaks of organic disease, for example, due to contamination of the environment
by a toxic substance.

behavioral risk factor A characteristic or behavior that is associated with increased
probability of a specified outcome; the term does not imply a causal relationship.

benchmark A slang or jargon term, usually meaning a measurement taken at the out¬ 
set of a series of measurements of the same variable, sometimes meaning the best 
or most desirable value of the variable. Because of uncertainty about meaning, the 
term should not be used. 

benefit-cost ratio The ratio of net present value of measurable benefits to costs.
Calculation of a benefit-cost ratio is used to determine the economic feasibility or 
success of a program. 
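
One common way to compute such a ratio is to discount yearly benefits and costs to present value before dividing. The sketch below assumes that approach; the discount rate, the yearly figures, and the function names are illustrative assumptions and are not taken from the dictionary.

```python
def present_value(amounts, discount_rate):
    """Discount a yearly stream of amounts (year 0 first) to present value."""
    return sum(a / (1.0 + discount_rate) ** t for t, a in enumerate(amounts))


def benefit_cost_ratio(benefits, costs, discount_rate):
    """Present value of benefits divided by present value of costs."""
    pv_costs = present_value(costs, discount_rate)
    if pv_costs <= 0:
        raise ValueError("present value of costs must be positive")
    return present_value(benefits, discount_rate) / pv_costs


# Hypothetical program: 100 units spent now, 60 units of benefit in each of the
# next two years, discounted at an assumed 5% per year.
ratio = benefit_cost_ratio([0.0, 60.0, 60.0], [100.0, 0.0, 0.0], 0.05)
print(round(ratio, 3))
```

A ratio above 1 suggests that discounted benefits exceed discounted costs under the chosen discount rate; the result can change with a different rate or time horizon.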

Bernoulli distribution The probability distribution associated with two mutually exclusive
and exhaustive outcomes, e.g., death or survival; a Bernoulli variable is one
that has only two possible values, e.g., death or survival. See also binomial distribution.

berkson’s bias See bias, selection. 

BETA ERROR See ERROR, TYPE II.

bias Deviation of results or inferences from the truth, or processes leading to such
deviation. Any trend in the collection, analysis, interpretation, publication, or review
of data that can lead to conclusions that are systematically different from the
truth. Among the ways in which deviation from the truth can occur, are the following:

1. Systematic (one-sided) variation of measurements from the true values (syn:
systematic error).

2. Variation of statistical summary measures (means, rates, measures of association,
etc.) from their true values as a result of systematic variation of measurements,
other flaws in data collection, or flaws in study design or analysis.

3. Deviation of inferences from the truth as a result of flaws in study design,
data collection, or the analysis or interpretation of results.

4. A tendency of procedures (in study design, data collection, analysis, interpre¬ 
tation, review or publication) to yield results or conclusions that depart from 
the truth. 

5. Prejudice leading to the conscious or unconscious selection of study proce¬ 
dures that depart from the truth in a particular direction, or to one-sidedness 
in the interpretation of results. 

The term "bias" does not necessarily carry an imputation of prejudice or other
subjective factor, such as the experimenter's desire for a particular outcome. This
differs from conventional usage in which bias refers to a partisan point of view. 

Many varieties of bias have been described. 1 

1 Sackett DL: Bias in analytic research. J Chron Dis 32:51-63, 1979.

bias, ascertainment Systematic error, arising from the kind of individuals or patients 
(e.g., slightly ill, moderately ill, acutely ill) that the individual observer is seeing. 
Also systematic error arising from the diagnostic process (which may be determined 
by the culture, customs, or individual idiosyncrasy of the person providing care for 
the patient). 

bias, in assumption (Syn: conceptual bias) Error arising from faulty logic or premises
or mistaken beliefs on the part of the investigator. False conclusions about the explanation
for associations between variables. Example: Having correctly deduced
the mode of transmission of cholera, John Snow concluded that yellow fever was
transmitted by similar means. In fact, the "miasma" theory would better fit the facts
of yellow fever transmission.

bias in autopsy series Systematic error resulting from the fact that autopsies repre¬ 
sent a nonrandom sample of all deaths. 

bias, berkson’s See bias, selection. 

BIAS DUE TO CONFOUNDING See CONFOUNDING.

bias, design The difference between a true value and that actually obtained, occurring
as a result of faulty design of a study. Some examples are (1) uncontrolled studies
where the effects of two processes cannot be separated (confounding), (2) controlled
studies where observations are based on a poorly defined population, and
(3) nonsimultaneous comparisons, e.g., use of historical controls.

bias, detection Due to systematic error(s) in methods of ascertainment, diagnosis, or 
verification of cases in an epidemiologic survey, study, or investigation. Example: 
Verification of diagnosis by laboratory tests in hospital cases, but failure to apply
the same tests to cases outside the hospital.

BIAS DUE TO DIGIT PREFERENCE See DIGIT PREFERENCE. 

bias in handling outliers Error arising from a failure to discard an unusual value 
occurring in a small sample, or due to exclusion of unusual values that should be
included. 

bias, information (Syn: observational bias) A flaw in measuring exposure or outcome 
that results in differential quality (accuracy) of information between compared groups. 

bias due to instrumental error Systematic error due to faulty calibration, inaccur¬ 
ate measuring instruments, contaminated reagents, incorrect dilution or mixing of 
reagents, etc. 

bias of interpretation Error arising from inference and speculation. Sources of the
error include (1) failure of the investigator to consider every interpretation consistent
with the facts and to assess the credentials of each, and (2) mishandling of
cases that constitute exceptions to some general conclusion. 

bias, interviewer Systematic error due to interviewers' subconscious or even conscious
gathering of selective data.

bias, "lead-time" A systematic error arising when follow-up of two groups does not
begin at strictly comparable times. Occurs especially when one group has been di¬ 
agnosed earlier in the natural history of the disease than the other group. See also 
ZERO TIME SHIFT. 

bias, length A systematic error due to the selection of a disproportionate number of
long-duration cases (cases who survive longest) in one group and not in the other. 
Can occur when prevalent cases, rather than incident cases, are included in a case 
control study. 

bias, measurement Systematic error arising from inaccurate measurement (or classifi¬ 
cation) of subjects on the study variables. 

bias, observer Systematic difference between a true value and that actually observed 
due to observer variation. Observer variation may be due to differences among 
observers (interobserver variation) or to variation in readings by the same observer
on separate occasions (intraobserver variation). See also observer variation. 

bias in the presentation of data Error due to irregularities produced by digit pref¬ 
erence, incomplete data, poor techniques of measurement, or technically poor lab¬ 
oratory standards. 

bias in publication An editorial predilection for publishing particular findings, e.g., 
positive results, which leads to the failure of authors to submit negative findings for 
publication or failure of journal editors to accept and publish reports with negative 
findings. This can distort the general belief about what has been demonstrated in a 
particular situation. 

bias of an estimator The difference between the expected value of an estimator of a 
parameter and the true value of this parameter. See also unbiassed estimator. 
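
The definition can be illustrated with a small exact calculation. The following is a minimal sketch, not part of the original entry: it enumerates every equally likely sample of size 2 from a fair binary variable and compares the expected value of two variance estimators with the true variance. The function names, the choice of population, and the sample size are all illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def expected_value(estimator, values, n):
    """Exact expected value of `estimator` over all equally likely samples of size n."""
    samples = list(product(values, repeat=n))
    return sum(Fraction(estimator(s)) for s in samples) / len(samples)


def variance_divide_by_n(sample):
    """Variance estimator that divides the sum of squares by n."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / n


def variance_divide_by_n_minus_1(sample):
    """Variance estimator that divides the sum of squares by n - 1."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


values = (0, 1)                 # hypothetical fair binary variable
true_variance = Fraction(1, 4)  # p(1 - p) with p = 1/2

# Bias = expected value of the estimator minus the true parameter value.
bias_n = expected_value(variance_divide_by_n, values, 2) - true_variance
bias_n_minus_1 = expected_value(variance_divide_by_n_minus_1, values, 2) - true_variance
print(bias_n, bias_n_minus_1)
```

In this toy setting the divide-by-n estimator has a nonzero bias, while the divide-by-(n - 1) estimator has none.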

bias, recall Systematic error due to differences in accuracy or completeness of recall 
to memory of prior events or experiences. Example: Mothers whose children have 
had or have died of leukemia are more likely than mothers of healthy living children
to remember details of diagnostic x-ray examinations to which these children
were exposed in utero. 

bias, reporting Selective suppression or revealing of information such as past history 
of sexually transmitted disease. 

bias, response Systematic error due to difference in characteristics between those who 
choose or volunteer to participate in a study and those who do not.

bias, sampling Unless the sampling method ensures that all members of the “universe”
or reference population have a known chance of selection in the sample, bias is 
possible. The best way to ensure a known chance of selection for all is to use a 
probability sampling method such as a table of random numbers. 

bias, selection Error due to systematic differences in characteristics between those
who are selected for study and those who are not. Examples include hospital cases
or cases under a physician’s care, excluding those who die before admission to
hospital because the course of their disease is so acute, those not sick enough to
require hospital care, or those excluded by distance, cost, or other factors. Selection
bias also invalidates generalizable conclusions from surveys that would include only
volunteers from a healthy population.

A special example is Berkson’s bias,1 which Berkson characterized as the set of
selective factors that lead hospital cases and controls in a case control study to be
systematically different from one another. This occurs when the combination of
exposure and disease under study increases the risk of hospital admission, thus
leading to a systematically higher exposure rate among the hospital cases than the
hospital controls. This in turn results in systematic distortion of the odds ratio.

1 Berkson J: Limitations of the application of fourfold table analysis to hospital data.
Biometrics Bull 2:47-53, 1946.

bias due to withdrawals A difference between the true value and that actually
observed in a study due to the characteristics of those subjects who choose to
withdraw.

Bills of Mortality Weekly and annual abstracts of christenings and burials,
distinguishing deaths from the plague, compiled for London (and some other cities),
especially in times of plague, from the English parish registers that started in 1538.
From 1629, the annual bill was published regularly and included a breakdown of
deaths by cause. These records were the basis for the earliest vital statistics,
compiled, analyzed, and discussed by John Graunt in Natural and Political Observations
. . . on the Bills of Mortality (1662).

bimodal distribution A distribution with two regions of high frequency separated by 
a region of low frequency of observations. A two-peak distribution. 

binary variable A variable having only two possible values, e.g., on or off, 0 or 1. See
also BIT. 

binomial distribution A probability distribution associated with two mutually
exclusive outcomes, e.g., presence or absence of a clinical or laboratory sign, death
or survival. The probability distribution of the number of occurrences of a binary
event in a sample of n independent observations. The binomial distribution is used
to model cumulative incidence rates and prevalence rates. The Bernoulli
distribution is a special case of the binomial distribution with n = 1.
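
As a brief sketch of how such probabilities are computed, the standard binomial probability function gives the chance of exactly k occurrences among n independent observations when each has probability p. The function name and the example figures (a hypothetical prevalence of 0.1 in a sample of 10) are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences in n independent observations,
    each with probability p of the binary event."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical example: prevalence 0.1, sample of 10 persons.
p_two_cases = binomial_pmf(2, 10, 0.1)
print(round(p_two_cases, 4))

# With n = 1 the distribution reduces to the Bernoulli case.
p_single = binomial_pmf(1, 1, 0.3)
print(p_single)
```

The probabilities for k = 0, 1, ..., n sum to one, which is a useful check on any implementation.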

bioassay The quantitative evaluation of the potency of a substance by assessing its ef¬ 
fects on tissues, cells, live experimental animals, or humans. 

Bioassay may be a direct method of estimating relative potency: groups of sub¬ 
jects are assigned to each of two (or more) preparations; the dose that is just suffi¬ 
cient to produce a specified response is measured, and the estimate is the ratio of 
the mean doses for the two (or more) groups. In this method, the death of the 
subject may be used as the “response.”

The indirect method (more commonly used) requires study of the relationship 
between the magnitude of a dose and the magnitude of a quantitative response 
produced by it. 

biological plausibility The criterion that an observed, presumably or putatively causal
association fits previously existing biological or medical knowledge. This judgment
should be used cautiously since it could impede development of new knowledge
that does not fit existing ideas.

BIOLOGICAL TRANSMISSION See VECTOR-BORNE INFECTION. 

biometry [literally, the measurement of life] The application of statistical methods to the
study of numerical data based on biological observations and phenomena. The term
was coined by W. F. R. Weldon (1860-1906), a zoologist at University College,
London. Francis Galton has been called “the father of biometry” for his application
of statistical methods to the analysis of biological variation. However, others
preceded him, e.g., Quetelet and Louis.

biostatistics Application of statistics to biological problems. The term is considered
by many biomedical scientists to mean the application of statistics specifically to
medical problems, but its real meaning is broader.

Biraud, Yves (1900-1965) French physician and statistician. He served the League of 
Nations and later WHO as Director of Epidemiological and Statistical Services from 
1925 to 1960. In 1960, he founded the first chair of Health Statistics in France, at
the Ecole de santé publique in Rennes.

birth certificate Official, legal document recording details of a live birth, usually
comprising name, date, place, identity of parents, and sometimes additional
information such as birth weight. It provides the basis for vital statistics of birth and
birthrates in a political or administrative jurisdiction, and for the denominator for
infant mortality and certain other vital rates.

birth cohort See COHORT. 

BIRTH COHORT ANALYSIS See COHORT ANALYSIS 

birth interval Interval between termination of one completed pregnancy and the 
termination of the next.

birth order The ranking of siblings according to age, starting with the eldest in a 
family. The ordinal number of a given live birth in relation to all previous live 
births of the same woman. Thus, 4 is the birth order of the fourth live birth
occurring to the same woman. This strict demographic definition may be loosened to
include all births, i.e., still-births as well as live births.

birth rate A summary rate based on the number of live births in a population over a 
given period, usually one year. 

$$
\text{Birth rate} = \frac{\text{Number of live births to residents in an area in a calendar year}}{\text{Average or midyear population in the area in that year}} \times 1000
$$
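
The birth rate is the number of live births to residents of an area in a calendar year, divided by the average or midyear population of that area, multiplied by 1000. A minimal sketch of the calculation follows; the function name and the figures used are hypothetical and serve only as an illustration.

```python
def birth_rate(live_births, midyear_population):
    """Live births per 1000 population for one area and one calendar year."""
    if midyear_population <= 0:
        raise ValueError("midyear population must be positive")
    return live_births * 1000 / midyear_population


# Hypothetical area: 1500 live births, midyear population of 100,000.
rate = birth_rate(1500, 100_000)
print(rate)
```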




birth weight Infant's weight recorded at the time of birth and, in some countries,
entered on the birth certificate. Certain variants of birth weight are precisely
defined. Low birth weight (LBW) is below 2500 g. Very low birth weight (VLBW) is
below 1500 g. Ultralow birth weight (ULBW) is below 1000 g. Large for gestational
age (LGA) is birth weight above the 90th percentile. Average weight for gestational
age (AGA) (Syn: appropriate or adequate): birth weight between 10th and 90th
percentiles. Small for gestational age (SGA) (Syn: small for dates): birth weight
below 10th percentile.

bit Acronym for binary digit; the signal in computing. See also byte. 

“black box” A jargon term, meaning a method of reasoning or studying a problem,
in which the methods, procedures, etc., as such are not described, explained, or 
perhaps even understood. Nothing is stated or inferred about the method; discus¬ 
sion and conclusions relate solely to the empirical relationships observed. An alter¬ 
native definition is the following: A method of formally relating an input, e.g., 
quantity of a drug absorbed over a period or a putative causal factor, to an output, 
e.g., the amount of the drug eliminated in a given period, or an observed effect, 
without making detailed assumptions about the mechanisms that have contributed 
to the transformation of input to output within the organism (the “black box”). 

blind(ed) study (Syn: masked study) A study in which observer(s) and/or subjects are
kept ignorant of the group to which the subjects are assigned, as in an experiment,
or of the population from which the subjects come, as in a nonexperimental study.
When both observer and subjects are kept ignorant, we refer to a double-blind
study. If the statistical analysis is also done in ignorance of the group to which
subjects belong, the study is sometimes described as triple-blind. The intent of keeping
subjects and/or investigators blinded, i.e., unaware of knowledge that might
introduce a bias, is to eliminate the effects of such biases. To avoid confusion about the
meaning of the word “blind” some authors prefer to describe such studies as
“masked.”

blocked randomization See stratified randomization. The analogue in a random¬ 
ized experiment of individual matching in an observational study. 

body mass index (Syn: Quetelet's index) One of the anthropometric measures of body
mass. Defined as (weight) ÷ (height)². This measure has the highest correlation
with skinfold thickness or body density and in this respect is superior to the
ponderal index.
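
Quetelet's index is weight divided by the square of height. The sketch below assumes the customary units of kilograms and meters, which the entry itself does not state; the function name and example values are likewise hypothetical.

```python
def body_mass_index(weight_kg, height_m):
    """Quetelet's index: weight divided by height squared (assumed units: kg, m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject: 70 kg, 1.75 m.
bmi = body_mass_index(70.0, 1.75)
print(round(bmi, 1))
```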

bootstrap A technique for estimating the variance and the bias of an estimator by 
repeatedly drawing random samples with replacement from the observations at hand. 
One applies the estimator to each sample drawn, thus obtaining a set of estimates.
The observed variance of this set is the bootstrap estimate of variance. The differ¬ 
ence between the average of the set of estimates and the original estimate is the 
bootstrap estimate of bias. 
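
The procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name, the number of resamples, the fixed seed, and the example data are hypothetical, and the estimator chosen for the example (the divide-by-n variance) is simply one whose bias is easy to see.

```python
import random
from statistics import mean, pvariance, variance


def bootstrap(data, estimator, n_resamples=2000, seed=0):
    """Bootstrap estimates of the variance and bias of `estimator` on `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Draw samples of size n with replacement and apply the estimator to each.
    estimates = [
        estimator([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    ]
    original = estimator(data)
    return {
        "estimate": original,
        # Observed variance of the set of estimates.
        "variance": variance(estimates),
        # Average of the set of estimates minus the original estimate.
        "bias": mean(estimates) - original,
    }


# Hypothetical observations; pvariance divides by n and tends to underestimate.
observations = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
result = bootstrap(observations, pvariance)
print(result)
```

More resamples reduce the simulation noise in both estimates but do not add information beyond what the original observations contain.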

breakpoint In helminth epidemiology, the critical mean wormload in a community, 
below which the helminth mating frequency is too low to maintain reproduction. A
value exceeding the breakpoint of a wormload means that the wormload will in¬ 
crease until equilibrium is reached; a value less than or equal to the breakpoint 
means that the wormload will decrease progressively. 

byte A group of adjacent bits, commonly 4, 6, or 8, operating as a unit for storage and 
manipulation of data in a computer. See also bit.

CALIPER MATCHING See MATCHING.

Canadian mortality data base A large set of computer-stored death statistics;
personal identifiers and causes of all deaths in Canada since 1950 have been
computer-stored, and the death certificates have been preserved on microfiche. This data base
and record linkage have been used in some important historical cohort studies. See 
also NATIONAL DEATH INDEX. 

cancer registry See register. 

CARRIER 

1. A person or animal that harbors a specific infectious agent in the absence of
discernible clinical disease and serves as a potential source of infection. The
carrier state may occur in an individual with an infection that is inapparent
throughout its course (known as healthy or asymptomatic carrier), or during
the incubation period, convalescence, and postconvalescence of an individual
with a clinically recognizable disease (known as incubatory carrier or
convalescent carrier). The carrier state may be of short or long duration (temporary
or transient carrier or chronic carrier).1

1 Adapted from Control of Communicable Diseases in Man, 14th ed. Washington, DC: American Public
Health Association, 1985.

carrying capacity An estimate of the numbers of people that a nation, region, or the
planet can sustain. 

case In epidemiology, a person in the population or study group identified as having 
the particular disease, health disorder, or condition under investigation. A variety 
of criteria may be used to identify cases, e.g., individual physicians’ diagnoses,
registries and notifications, abstracts of clinical records, surveys of the general
population, population screening, and reporting of defects such as in a dental record.
The epidemiologic definition of a case is not necessarily the same as the ordinary 
clinical definition. 

case-base study A study that starts with the identification and sampling of persons
with the disease of interest, and then samples the entire base population (of cases
and noncases) from which the original cases arose. This design is similar to a case 
control study in most respects, but cases may appear in the comparison (base) 
sample as well as in the case sample. 

CASE, COLLATERAL A case occurring in the immediate vicinity of a case which has been 
the subject of an epidemiological investigation; a term used mainly in malaria
control programs, equivalent to the term contact as used in infectious disease
epidemiology.

CASE COMPARISON STUDY See CASE CONTROL STUDY. 

CASE COMPEER STUDY See CASE CONTROL STUDY.

case control study (Syn: case comparison study, case compeer study, case history
study, case referent study, retrospective study) A study that starts with the
identification of persons with the disease (or other outcome variable) of interest, and a
suitable control (comparison, reference) group of persons without the disease. The
relationship of an attribute to the disease is examined by comparing the diseased
and nondiseased with regard to how frequently the attribute is present or, if
quantitative, the levels of the attribute, in each of the groups.

Such a study can be called “retrospective” because it starts after the onset of
disease and looks back to the postulated causal factors. Cases and controls in a case
control study may be accumulated “prospectively”; that is, as each new case is
diagnosed it is entered in the study. Nevertheless, such a study may still be called
“retrospective” because it looks back from the outcome to its causes. The terms
“cases” and “controls” are sometimes used to describe subjects in a randomized
controlled trial, but the term “case control study” should not be used to describe
such a study.

The terms “case control study” and “retrospective study” have been used most
often to describe this method. Other terms also used are listed above. The concept
of the case-control study is to be found in the works of P.C.A. Louis;1 the first
explicit description of the method is contained in a paper by William Augustus Guy,
who reported his analysis of the relationship between prior occupational exposure
and the occurrence of pulmonary consumption to the Statistical Society of London
in 1843.2 The evolution of the case-control study thereafter has been described by
Lilienfeld and Lilienfeld.3 The first modern use of the method was a case-control
study of breast cancer, reported by Lane-Claypon4 in 1926; from that time onward,
case-control studies became increasingly popular and widely used.

1 Louis PCA: Researches on Phthisis: Anatomical, Pathological and Therapeutical. (Trans. W.H.
Walshe). London: Sydenham Society, 1844.

2 Guy WA: Contributions to a knowledge of the influence of employments on health. J Roy Stat Soc
6:197-211, 1843.

3 Lilienfeld AM, Lilienfeld D: A century of case-control studies: progress? J Chron Dis 32:3-13,
1979.

4 Lane-Claypon JE: A further report on cancer of the breast. Rept Pub Hlth Med Subj 32. London:
HMSO, 1926.

case fatality rate The proportion of cases of a specified condition which are fatal 
within a specified time. 

$$
\text{Case fatality rate (usually expressed as a percentage)} = \frac{\text{Number of deaths from a disease (in a given period)}}{\text{Number of diagnosed cases of that disease (in the same period)}} \times 100
$$

This definition can lead to paradox when more persons die of the disease than
develop it during a given period. For instance, chemical poisoning that is slowly but
inexorably fatal may cause many persons to develop the disease over a relatively
short period of time, but the deaths may not occur until some years later and may
be spread over a period of years during which there are no new cases. Thus, in
calculating the case fatality rate, it is necessary to acknowledge that the time
dimension varies: it may be brief, e.g., covering only the period of stay in a hospital, of
finite duration, e.g., one year, or of longer duration still. The term “case fatality
rate” is then better replaced by a term such as “survival rate” or by the use of a
survivorship table. See also attack rate.
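
A short sketch of the calculation, expressed as a percentage, also makes the paradox concrete: when deaths in a period arise from cases diagnosed in earlier periods, the ratio can exceed 100. The function name and all counts below are hypothetical.

```python
def case_fatality_rate(deaths, diagnosed_cases):
    """Deaths from a disease per 100 diagnosed cases in the same period."""
    if diagnosed_cases <= 0:
        raise ValueError("number of diagnosed cases must be positive")
    return deaths * 100 / diagnosed_cases


# Hypothetical acute disease: 12 deaths among 150 cases diagnosed in the period.
acute = case_fatality_rate(12, 150)

# Hypothetical slow poisoning: 5 deaths in a year with only 2 new diagnoses,
# because most of the deaths come from cases diagnosed years earlier.
delayed = case_fatality_rate(5, 2)
print(acute, delayed)
```

A result above 100 signals that numerator and denominator do not refer to the same group of cases, which is the situation in which a survival rate or survivorship table is the better summary.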

CASE HISTORY STUDY

1. Synonym for case control study.

2. In clinical medicine, a case report, or a report on a series of cases.

case referent study See case control study.

catastrophe theory A branch of mathematics dealing with large changes in the total 
system that may result from small changes in a critical variable in the system. An 
example is the sudden change in the physical state of water into steam or ice with
rise or fall of temperature beyond a critical level. Certain epidemics, gene
frequencies, and behavioral phenomena in populations may abide by the same
mathematical rule. Herd immunity is an example.

catchment area Region, which may be well- or ill-defined, from which the clients of 
a particular health facility are drawn. 

causality The relating of causes to the effects they produce. Most of epidemiology 
concerns causality and several types of causes can be distinguished. It should be 
clearly stated, however, that epidemiologic evidence by itself is insufficient to
establish causality.

A cause is termed “necessary” when it must always precede an effect. This effect
need not be the sole result of the one cause. A cause is termed “sufficient” when it
inevitably initiates or produces an effect. Any given cause may be necessary,
sufficient, neither, or both. These possibilities are explained below.

Four conditions under which independent variable X may cause Y

        X is         X is
        necessary    sufficient

1.      +            +
2.      +            -
3.      -            +
4.      -            -

1. X is necessary and sufficient to cause Y; both X and Y are always present
together, and nothing but X is needed to cause Y; X→Y.

2. X is necessary but not sufficient to cause Y. X must be present when Y is
present, but Y is not always present when X is. Some additional factor(s) must also
be present; X and Z→Y.

3. X is not necessary but is sufficient to cause Y. Y is present when X is, but X
may or may not be present when Y is present, because Y has other causes and
can occur without X. For example, an enlarged spleen can have many separate
causes that are unconnected with each other; X→Y; Z→Y.

4. X is neither necessary nor sufficient to cause Y. Again, X may or may not be
present when Y is present. Under these conditions, however, if X is present
with Y, some additional factor must also be present. Here X is a contributory
cause of Y in some causal sequences; X and Z→Y; W and Z→Y. These
relationships and the logic of causal inference are discussed in Causal Inference.1
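
The four conditions can be read as two logical checks on paired observations of X and Y: X is necessary if Y never appears without X, and X is sufficient if X never appears without Y. The following toy sketch applies those checks to lists of (X present, Y present) pairs. It is a deterministic simplification for illustration only; the function name and the example data are hypothetical, and real epidemiologic data are probabilistic and cannot establish causality in this mechanical way.

```python
def necessary_and_sufficient(observations):
    """Return (necessary, sufficient) for pairs of (x_present, y_present)."""
    pairs = list(observations)
    necessary = all(x for x, y in pairs if y)   # Y never appears without X
    sufficient = all(y for x, y in pairs if x)  # X never appears without Y
    return necessary, sufficient


# One hypothetical data set per condition in the list above.
condition_1 = [(True, True), (False, False)]
condition_2 = [(True, True), (True, False), (False, False)]
condition_3 = [(True, True), (False, True), (False, False)]
condition_4 = [(True, True), (True, False), (False, True)]

for data in (condition_1, condition_2, condition_3, condition_4):
    print(necessary_and_sufficient(data))
```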

1 Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.

causation of disease, factors in The following factors have been differentiated (but
they are not mutually exclusive): 

Predisposing factors are those that prepare, sensitize, condition, or otherwise create
a situation such as a level of immunity or state of susceptibility so that the host
tends to react in a specific fashion to a disease agent, personal interaction,
environmental stimulus, or specific incentive. Examples include age, sex, marital status,
family size, educational level, previous illness experience, presence of concurrent
illness, dependency, working environment, and attitudes toward the use of health
services. These factors may be “necessary” but are rarely “sufficient” to cause the
phenomenon under study.

Enabling factors are those that facilitate the manifestation of disease, disability,
ill-health, or the use of services or conversely those that facilitate recovery from illness,
maintenance or enhancement of health status, or more appropriate use of health
services. Examples include income, health insurance coverage, nutrition, climate,
housing, personal support systems, and availability of medical care. These factors
may be “necessary” but are rarely “sufficient” to cause the phenomenon under study.

Precipitating factors are those associated with the definitive onset of a disease,
illness, accident, behavioral response, or course of action. Usually one factor is more
important or more obviously recognizable than others if several are involved and
one may often be regarded as “necessary.” Examples include exposure to specific
disease, amount or level of an infectious organism, drug, noxious agent, physical
trauma, personal interaction, occupational stimulus, or new awareness or
knowledge.

Reinforcing factors are those tending to perpetuate or aggravate the presence of a
disease, disability, impairment, attitude, pattern of behavior, or course of action.
They may tend to be repetitive, recurrent, or persistent and may or may not
necessarily be the same or similar to those categorized as predisposing, enabling, or
precipitating. Examples include repeated exposure to the same noxious stimulus (in
the absence of an appropriate immune response) such as an infectious agent, work,
household, or interpersonal environment, presence of financial incentive or
disincentive, personal satisfaction, or deprivation.

CAUSES OF DEATH See DEATH CERTIFICATE. 

cause-deleted life table A life table constructed using death rates lowered by
eliminating the risk of dying from a specified cause; its most common use is to calculate
the gain in life expectancy that would result from the elimination of one cause. 

cause-specific rate A rate that specifies events, such as deaths, according to their 
cause. 

censoring This term refers to the loss of subjects from a follow-up study; the
occurrence of the event of interest among such subjects is uncertain after a specified time
when it was known that the event of interest had not occurred; it is not known,
however, if or when the event of interest occurred subsequently. Such subjects are 
described as censored. For example, in a follow-up study with myocardial infarction 
as the outcome of interest, a subject who has not had an infarct but is killed in a 
traffic crash in year 6 is described as censored as of year 6, since it cannot be known 
when, if ever, he might have had an infarct at a later year of follow-up. This is 
censoring by competing risk; other varieties include loss to follow-up and termina¬ 
tion of the study. Examination of data for censoring requires the use of special 
analytic methods, such as life table analysis. 

census An enumeration of a population, originally intended for purposes of taxation 
and military service. Census enumeration of a population usually records identities 
of all persons in every place of residence, with age, or birth date, sex, occupation, 
national origin, language, marital status, income, and relationship to head of
household, in addition to information on the dwelling place. Many other items of
information may be included, e.g., educational level (or literacy), and health-related data
such as permanent disability. A de facto census allocates persons according to their 
location at the time of enumeration. A de jure census assigns persons according to 
their usual place of residence at the time of enumeration. 

census tract An area for which details of population structure are separately
tabulated at a periodic census; normally it is the smallest unit of analysis of (published)
census tabulations. Census tracts are chosen because they have well-defined
boundaries, sometimes the same as local political jurisdictions, sometimes defined by
conspicuous geographical features such as main roads, rivers. In urban areas census
tracts may be further subdivided, e.g., into city blocks, but published tables do not
contain details to this level.

CENTILE See QUANTILES.

cessation experiment Controlled study in which an attempt is made to evaluate the 
termination of an exposure to risk such as a living habit that is considered to be of 
etiologic importance.

chart The medical dossier of a patient. See also information system; medical re¬ 
cord. 

check digit A single digit, derived from a multidigit number such as a case
identification number, that is used as a screening test for transcription errors.
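
The entry does not name a particular scheme. As one widely used example, the sketch below applies the Luhn algorithm to derive a check digit from a string of digits and to screen a complete number for transcription errors. The function names and the sample identification number are illustrative assumptions.

```python
def luhn_check_digit(payload):
    """Check digit for a string of decimal digits, using the Luhn algorithm."""
    total = 0
    # Walk the payload right to left, doubling every second digit
    # starting with the rightmost one.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def passes_check(number):
    """True if the final digit of `number` matches its Luhn check digit."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


case_id = "7992739871"                       # hypothetical identification number
full_id = case_id + str(luhn_check_digit(case_id))
print(full_id, passes_check(full_id))

miscopied = "79927398703"                    # one digit transcribed wrongly
print(passes_check(miscopied))
```

A check digit is a screening test only: it catches every single-digit error under this scheme, but some other errors, such as certain transpositions, can pass undetected.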

chemoprophylaxis The administration of a chemical, including antibiotics, to prevent
the development of an infection or the progression of an infection to active
manifest disease.

chemotherapy The use of a chemical to treat a clinically recognizable disease or to 
limit its further progress. 

child death rate The number of deaths of children aged 1-4 years in a given year 
per 1000 children in this age group. This is a useful measure of the burden of 
preventable communicable diseases in the child population. 

chi-square (χ²) distribution A variable is said to have a chi-square distribution with
K degrees of freedom if it is distributed like the sum of the squares of K
independent random variables, each of which has a normal distribution with mean zero and
variance one.

chi-square (χ²) test Any statistical test based on comparison of a test statistic to a
chi-square distribution. The oldest and most common chi-square tests are for detecting
whether two or more population distributions differ from one another; these tests
usually involve counts of data, and may involve comparison of samples from the
distributions under study, or the comparison of a sample to a theoretically expected
distribution. The Pearson chi-square test is probably the best known; another is the
Mantel-Haenszel test. (Statisticians disagree about the terminal letter; a bare
majority of those who contributed to the discussion of this entry prefer “chi-square”
rather than “chi-squared.” Either usage is acceptable.)
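
As a minimal sketch of the Pearson form of the test applied to a table of counts, the statistic sums (observed - expected)² / expected over all cells, with expected counts taken from the row and column totals, and is referred to a chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom. The function name and the counts are hypothetical, and no continuity correction is applied.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical 2 x 2 table: rows are two groups, columns are exposed / not exposed.
counts = [[20, 30],
          [30, 20]]
statistic, df = pearson_chi_square(counts)
print(statistic, df)
```

With one degree of freedom, a statistic above about 3.84 corresponds to P < 0.05.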

chrisoms This word, which appears in Bills of Mortality, means infants who die 
before formal baptism; therefore, the number recorded in Bills of Mortality can be
used to estimate (albeit inaccurately) neonatal death rates in studies of historical
demography and epidemiology. 

chronic 1. Referring to a health-related state, lasting a long time. 2. Referring to
exposure, prolonged or long-term, often with specific reference to low-intensity. 3.
The U.S. National Center for Health Statistics defines a “chronic” condition as one
of three months’ duration or longer.

class A term used in the theory of frequency distributions. The total number of
observations made upon a particular variate may be grouped into classes according to
convenient divisions of the variate range in order to make subsequent analyses less
laborious, or for other reasons. A group so determined is called a “class.” The
variate values that determine the upper and lower limits of a class are called “class
boundaries,” the interval between them is the class interval, and the frequency fall-

classification (Syn: categorization) Assignment to predesignated classes on the basis
of perceived common characteristics. A means of giving order to a group of
disconnected facts. Ideally, a classification should be characterized by (1) naturalness: the
classes correspond to the nature of the thing being classified, (2) exhaustiveness:
every member of the group will fit into one (and only one) class in the system, (3)
usefulness: the classification is practical, (4) simplicity: the subclasses are not
excessive, and (5) constructability: the set of classes can be constructed by a
demonstrably systematic procedure.

classification of diseases Arrangement of diseases into groups having common 
characteristics. Useful in efforts to achieve standardization, and therefore
comparability, in the methods of presentation of mortality and morbidity data from
different sources. May include a systematic numerical notation for each disease entry.

Examples include the International Classification of Diseases, Injuries, and
Causes of Death (ICD) and the International Classification of Health Problems
in Primary Care (ICHPPC).

class, social A method of socially stratifying populations, e.g., according to education, 
income, or occupation. See also socioeconomic classification. 

clinical decision analysis Application of decision analysis in a clinical setting with 
the aim of applying epidemiologic and other data on probability of outcomes when 
alternative decisions can be made, e.g., surgical intervention or drug treatment for 
myocardial ischemia.

clinical epidemiologist A practitioner of clinical epidemiology.

clinical epidemiology While some epidemiologists deplore any adjectival
qualification of the discipline, a subspecialty of clinical epidemiology is sufficiently
demarcated to justify definition. There are plenty of suggested definitions. John R. Paul1
proposed “A marriage between quantitative concepts used by epidemiologists to
study disease in populations and decision-making in the individual case which is the
daily fare of clinical medicine.” Patient care is central to Sackett’s definition2: “The
application, by a physician who provides direct patient care, of epidemiologic and
biometric methods to the study of diagnostic and therapeutic processes in order to
effect an improvement in health.” While limiting the discipline to medical graduates
in clinical practice, this definition is conceptually close to the definition of clinical
decision analysis; the proper distinction between clinical epidemiology and clinical
decision analysis may be that the epidemiologist works with a defined population,
even if it is a population of patients rather than a community-based population with
numerator and denominator in the conventional epidemiologic sense; clinical
decision analysis can be applied to a single patient. Abramson’s definition3 is “The use
of epidemiological principles, methods and findings in personal health care or
community-oriented primary care, with special reference to applications in
diagnostic and prognostic appraisal, decisions concerning care and the evaluation of
care. The term sometimes refers to any epidemiological study conducted in a
clinical setting.” Weiss4 defines clinical epidemiology as “The study of variation in the
outcome of illness and of the reasons for that variation.” The existence of the above
and other subtly different definitions suggests that this branch of epidemiology
remains inchoate.

1 J Clin Invest 17:539-541, 1938.

2 Am J Epidemiol 89:125-128, 1969.

3 Personal communication, 1986.

4 Clinical Epidemiology. New York: Oxford University Press, 1986.

clinical trial (Syn: therapeutic trial) A research activity that involves the
administration of a test regimen to humans to evaluate its efficacy and safety. The term is
subject to wide variation in usage, from the first use in humans without any control
treatment to a rigorously designed and executed experiment involving test and
control treatments and randomization.

See also community trial.

clinimetrics Feinstein,1 who coined this term, defines it as the domain concerned with
indexes, rating scales, and other expressions that are used to describe or measure
symptoms, physical signs, and other distinctly clinical phenomena in clinical medicine.
Such measurements, of course, are an essential part of many epidemiologic
studies.

1 Feinstein AR: Clinimetrics. New Haven and London: Yale University Press, 1987.

CLOSED COHORT A population in which membership begins at a defined time or with a 
defined event and ends only through occurrence of the study outcome or the end 
of eligibility for membership. An example is a population of women in labor being
studied to determine the vital status of their offspring (i.e., whether live or still¬ 
born). 

cluster analysis A set of statistical methods used to group variables or observations 
into strongly interrelated subgroups. 

clustering (Syn: disease cluster, time cluster, time-place cluster) A closely grouped 
series of events or cases of a disease or other health-related phenomena with well- 
defined distribution patterns, in relation to time or place or both. The term is nor¬ 
mally used to describe aggregation of relatively uncommon events or diseases, e.g., 
leukemia, multiple sclerosis. 

cluster sampling A sampling method in which each unit selected is a group of persons
(all persons in a city block, a family, etc.) rather than an individual.

coding Translation of information, e.g., questionnaire responses, into numbered categories
for entry in a data processing system.

coefficient of variation The ratio of the standard deviation to the mean. This
is meaningful only if the variable is measured on a ratio scale. See measurement 
scale. 
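
The ratio described above can be sketched in a few lines of Python. This is an illustration only: the function name and the sample weights are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed, which the definition does not specify.

```python
import statistics


def coefficient_of_variation(values):
    """Return standard deviation / mean for ratio-scale data.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical body weights in kilograms (a ratio scale).
weights_kg = [60.0, 70.0, 80.0]
cv = coefficient_of_variation(weights_kg)
print(f"coefficient of variation = {cv:.3f}")
```

Because the result is a pure ratio, it is unchanged if the same weights are expressed in pounds, which is why the measure requires a ratio scale.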

cohort [from Latin cohors, warriors, the tenth part of a legion]

1. The component of the population born during a particular period and iden¬ 
tified by period of birth so that its characteristics (e.g., causes of death and 
numbers still living) can be ascertained as it enters successive time and age 
periods. 

2. The term "cohort" has broadened to describe any designated group of persons
who are followed or traced over a period of time, as in cohort study
(prospective study). 

cohort analysis The tabulation and analysis of morbidity or mortality rates in rela¬ 
tionship to the ages of a specific group of people (cohort), identified at a particular 
period of time and followed as they pass through different ages during part or all 
of their life span. In certain circumstances, e.g., studies of migrant populations, 
cohort analysis may be performed according to duration of residence of migrants 
in a country rather than year of birth, in order to relate health or mortality expe¬ 
rience to duration of exposure. 

cohort component method A method of population projection that takes the population
distributed by age and sex at a base date and carries it forward in time on
the basis of separate allowances for fertility, mortality, and migration.
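
A minimal sketch of one projection step follows, under strong simplifying assumptions that are not part of the definition: a single sex, age groups whose width equals the projection interval, an open-ended last age group, and migrants added at the end of the interval. All names and figures are hypothetical.

```python
def project_one_step(population, survival, fertility, net_migration):
    """Carry an age-grouped population forward by one interval.

    population[i]    persons in age group i at the base date
    survival[i]      proportion of group i alive one interval later
    fertility[i]     surviving births per person in group i over the interval
    net_migration[i] net migrants added to group i at the end of the interval
    """
    k = len(population)
    # Fertility allowance: births enter the youngest age group.
    births = sum(p * f for p, f in zip(population, fertility))
    projected = [births]
    # Mortality allowance: survivors of group i - 1 move up to group i.
    for i in range(1, k):
        projected.append(population[i - 1] * survival[i - 1])
    # Survivors of the open-ended last group remain in it.
    projected[-1] += population[-1] * survival[-1]
    # Migration allowance.
    return [p + m for p, m in zip(projected, net_migration)]


base = [1000, 900, 800]
projected = project_one_step(
    base,
    survival=[0.99, 0.98, 0.50],
    fertility=[0.0, 0.5, 0.0],
    net_migration=[0, 10, 0],
)
print(projected)
```

A fuller projection would repeat this step for each sex and each interval, with births divided between the sexes.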

cohort effect See generation effect. 

COHORT INCIDENCE See INCIDENCE. 

cohort slopes Arrangement of data so that when plotted graphically, lines connect
points representing the age-specific rates for population segments from the same
generation of birth (see diagram). These slopes represent changes in rates with age
during the life experience of each cohort.

[Figure: Cohort curves for years of birth, 1860-1950. Cohort slopes (tuberculosis
mortality rates of successive birth generations). Death rates for tuberculosis, by age,
United States, 1900-1960 (per 100,000 population). From Susser, Watson, Hopper, 1985.]

cohort study (Syn: concurrent, follow-up, incidence, longitudinal, prospective study) 
The method of epidemiologic study in which subsets of a defined population can 
be identified who are, have been, or in the future may be exposed or not exposed, 
or exposed in different degrees, to a factor or factors hypothesized to influence the 
probability of occurrence of a given disease or other outcome. The alternative terms 
for a cohort study, i.e., follow-up, longitudinal, and prospective study, describe an 
essential feature of the method, which is observation of the population for a sufficient
number of person-years to generate reliable incidence or mortality rates in
the population subsets. This generally implies study of a large population, study 
for a prolonged period (years), or both. 

co-intervention In a randomized controlled trial, the application of additional diagnostic
or therapeutic procedures to members of either or both the experimental
and the control groups.

cold chain A system of protection against high environmental temperatures for heat- 
labile vaccines, sera, and other active biological preparations. Unless the cold chain 
is preserved, such preparations are inactivated and immunization procedures, etc. 
will be ineffective. Preservation of the cold chain is an integral part of the WHO
expanded program on immunization in tropical countries. 

collinearity Very high correlation between variables.

colonization See infection. 

commensal Literally, eating together (sharing the same table); an organism that lives 
harmlessly in the gut. See also xenobiotic, 

common source epidemic (Syn: common vehicle epidemic) See epidemic, common
source.

common vehicle spread Spread of disease agent from a source that is common
to those who acquire the disease, e.g., water, milk, shellfish, foods, air, or syringe
contaminated by infectious or noxious agents. See also transmission of infection.

communicable disease (Syn: infectious disease) An illness due to a specific infectious
agent or its toxic products that arises through transmission of that agent or its 
products from an infected person, animal, or reservoir to a susceptible host, either 
directly or indirectly through an intermediate plant or animal host, vector, or the 
inanimate environment. See also transmission of infection. 

communicable period The time during which an infectious agent may be transferred
directly or indirectly from an infected person to another person, from an infected
animal to man, or from an infected person to an animal, including arthropods. See
also transmission of infection. 

community A group of individuals organized into a unit, or manifesting some unifying
trait or common interest; loosely, the locality or catchment area population for which 
a service is provided, or more broadly, the state, nation, or body politic. 

community diagnosis The process of appraising the health status of a community, 
including assembly of vital statistics and other health-related statistics and of information
pertaining to determinants of health, such as prevalence of tobacco smoking,
and examination of the relationships of these determinants to health in the
specified community. The term may also denote the findings of this diagnostic pro¬ 
cess. Community diagnosis may attempt to be comprehensive, or may be restricted 
to specific health conditions, determinants, or subgroups. J.N. Morris 1 identified
community diagnosis as one of the uses of epidemiology.

1 Br Med J 2:395-401, 1955.

community health See public health. 

community medicine Since the late 1960s, this term has gained wide currency as the 
preferred name for important activities concerning health care in the community. 
There are several different definitions, including the following. 

1. The field concerned with the study of health and disease in the population of 
a defined community or group. Its goal is to identify the health problems and 
needs of defined populations, to identify means by which these needs should 
be met, and to evaluate the extent to which health services effectively meet 
these needs. 

2. The practice of medicine concerned with groups or populations rather than 
with individual patients. This includes the elements listed in definition 1, to¬ 
gether with the organization and provision of health care at a community or 
group level. 

3. The term is also used to describe the practice of medicine in the community, 
e.g., by a family physician. Some writers equate the terms “family medicine”
and “community medicine”; others confine its use to public health practice. 

4. Community-oriented primary health care is an integration of community 
medicine with the primary health care of individuals in the community. In
this form of practice the community practitioner or community health team
has responsibility for health care both at a community and at an individual 
level. 

See also public health; social medicine. 

community trial Experiment in which the unit of allocation to receive a preventive or
therapeutic regimen is an entire community or political subdivision. Examples in¬ 
clude the trials of fluoridation of drinking water, and of heart disease prevention 
in North Karelia (Finland) and California. See also clinical trial. 


Source: https://www.industrydocuments.ucsf.edu/docs/kypxOOOO 






comorbidity Disease(s) that coexist(s) in a study participant in addition to the index 
condition that is the subject of study. 

comparison group Any group to which the index group is compared. Usually synony¬ 
mous with control group. 

competing cause When a previously common cause of death becomes rare, other causes
become more prominent. These other causes are referred to as competing causes.
For instance, among young adults, pneumonia and other infections were a common
cause of death until about midway through the 20th century; their control has
brought to prominence some competing causes of death, notably malignant disease 
and suicide. 

competing risk An event that removes a subject from being at risk for the outcome 
under investigation. For example, in a study of smoking and cancer of the lung, a
subject who dies of coronary heart disease is no longer at risk of lung cancer, and
in this situation, coronary heart disease is a competing risk.
completed fertility rate The number of children born alive per woman in a cohort
of women by the end of their child-bearing years.
completing the clinical picture The use of epidemiology to define all modes of 
presentation of a disease, and/or all possible outcomes. One of the "uses of epidemiology"
identified by J.N. Morris.1
1 Br Med J 2:395-401, 1955.

completion rate The proportion or percentage of persons in a survey for whom 
complete data are available for analysis. See also response rate.
composite index An index, such as the Apgar score or the Tumor/Nodes/Metastases (TNM)
stage of cancer, that contains contributions from categories of several different variables.

computer A programmable electronic device that can be used to store and manipulate 
data in order to carry out designated functions. The two fundamental components
of a computer are hardware, i.e., the actual electronic device, and software, the 
instructions or program used to carry out the function. Computer science has cre¬ 
ated a large language of its own, describing types of computers (main-frame, micro, 
digital, analogue, etc.) and all aspects of the process. Most of the terms used in this
field are defined by A.J. Meadows, M. Gordon, and A. Singleton.1
1 Dictionary of New Information Technology. London: Century, 1982.

concordance Pairs or groups of individuals of identical phenotype. In twin studies, a 
condition in which both twins exhibit or fail to exhibit a trait under investigation. 
concordant A term used in twin studies to describe a twin pair in which both twins 
exhibit a certain trait. 
concurrent study See cohort study. 

conditional probability The probability of an event, given that another event has
occurred. If D and E are two events and P(. . .) is "the probability of (. . .)," the
conditional probability of D, given that E occurs, is denoted P(D|E), where the vertical
slash is read "given," and is equal to P(D and E)/P(E). The event E is the "conditioning
event." Conditional probabilities obey all the axioms of probability theory.
See also Bayes' theorem; probability theory.
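
As a worked illustration of P(D|E) = P(D and E)/P(E), the following sketch uses exact fractions. The counts are hypothetical and the function name is chosen only for this example.

```python
from fractions import Fraction


def conditional_probability(p_d_and_e, p_e):
    """Return P(D|E) = P(D and E) / P(E); undefined when P(E) is zero."""
    if p_e == 0:
        raise ValueError("P(E) must be greater than zero")
    return p_d_and_e / p_e


# Hypothetical population of 1000 persons: 200 have the conditioning event E,
# and 30 have both D and E.
p_e = Fraction(200, 1000)
p_d_and_e = Fraction(30, 1000)
p_d_given_e = conditional_probability(p_d_and_e, p_e)
print(p_d_given_e)
```

The population size cancels, so the result equals the proportion of persons with E who also have D.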
confidence interval A range of values for a variable of interest, e.g., a rate, con¬ 
structed so that this range has a specified probability of including the true value of 
the variable. The specified probability is called the confidence level, and the end 
points of the confidence interval are called the confidence limits. 
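
One common way to construct such a range for a proportion is the normal-approximation (Wald) interval, sketched below. The choice of method is an assumption made for illustration; the definition does not prescribe one, and the approximation is poor for small samples or proportions near 0 or 1. The counts are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval(events, n, level=0.95):
    """Return (lower, upper) confidence limits for a proportion.

    Assumption: normal approximation to the binomial distribution.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical survey: 40 cases among 200 persons examined.
lower, upper = wald_interval(40, 200)
print(f"95% confidence limits: {lower:.3f} to {upper:.3f}")
```

Here 0.95 is the confidence level, and the two returned values are the confidence limits.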
confounding [from the Latin confundere, to mix together]

1. A situation in which the effects of two processes are not separated. The distortion
of the apparent effect of an exposure on risk brought about by the
association with other factors that can influence the outcome.

2. A relationship between the effects of two or more causal factors as observed 
in a set of data, such that it is not logically possible to separate the contribution 
that any single causal factor has made to an effect. 

3. A situation in which a measure of the effect of an exposure on risk is distorted 
because of the association of exposure with other factor(s) that influence the 
outcome under study. 

confounding variable (Syn: confounder) A variable that can cause or prevent the
outcome of interest, is not an intermediate variable, and is associated with the
factor under investigation. Such a variable must be controlled in order to obtain an
undistorted estimate of the effect of the study factor on risk.
consanguine Related by a common ancestor within the previous few generations. 

CONSISTENCY 

1. Close conformity between the findings in different samples, strata, or populations,
or at different times or in different circumstances, or in studies conducted
by different methods or different investigators. Consistency may be
examined in order to study effect modification. Consistency of results on replication
of studies is an important criterion in judgments of causality.

2. In statistics, an estimator is said to be consistent if the probability of it yielding
estimates close to the true value approaches one as the sample size grows larger.
contact (OF an infection) A person or animal that has been in such association with 
an infected person or animal or a contaminated environment as to have had op¬ 
portunity to acquire the infection. 

contact, direct A mode of transmission of infection between an infected host and 
susceptible host. Direct contact occurs when skin or mucous surfaces touch, as in 
shaking hands, kissing, and sexual intercourse. See also contagion; transmission
of infection.

contact, indirect A mode of transmission of infection involving fomites or vectors. 
Vectors may be mechanical (e.g., filth flies) or biological (the disease agent undergoes
part of its life cycle in the vector species). See also transmission of infection.
contact, primary Person(s) in direct contact or associated with a communicable disease
case.

contact, secondary Person(s) in contact or associated with a primary contact. 
contagion The transmission of infection by direct contact, droplet spread, or contaminated
fomites. These are the modes of transmission specified by Fracastorius in
De Contagione (1546); contemporary usage is sometimes looser, but use of this term
is best restricted to description of infection transmitted by direct contact.
contagious Transmitted by contact; in common usage, "highly infectious." 
containment The concept of regional eradication of communicable disease, first proposed
by Soper in 1949 for the elimination of smallpox.1 Containment of a worldwide
communicable disease demands a globally coordinated effort so that countries
that have effected an interruption of transmission do not become reinfected following
importation from neighboring endemic areas.

1 Pan American Health Organization, OSP, CE7, W-15. Washington DC, 1949.
contamination 

1. The presence of an infectious agent on a body surface; also on or in clothes,
bedding, toys, surgical instruments or dressings, or other inanimate articles or
substances including water, milk, and food. Pollution is distinct from contamination
and implies the presence of offensive, but not necessarily infectious,
matter in the environment. Contamination of a body surface does not imply a
carrier state. See also transmission of infection.

2. The situation that exists when a population being studied for one condition 
or factor also possesses other conditions or factors that modify results of the 
study. In a randomized controlled trial, the inadvertent application of the 
experimental procedure to members of the control group, or inadvertent fail¬ 
ure to apply the procedure to members of the experimental group. 
contingency table A tabular cross-classification of data such that subcategories of one 
characteristic are indicated horizontally (in rows) and subcategories of another 
characteristic are indicated vertically (in columns). Tests of association between the 
characteristics in the columns and rows can be readily applied. The simplest contin¬ 
gency table is the fourfold, or 2x2 table. Contingency tables may be extended to 
include several dimensions of classification. 
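
The cross-classification and a test of association can be sketched as follows for the fourfold table. The data are hypothetical, and Pearson's chi-square without continuity correction is used as one possible test; the definition does not name a specific one.

```python
def two_by_two(pairs):
    """Cross-classify (exposed, diseased) pairs into a fourfold table.

    Rows: exposed, unexposed. Columns: diseased, not diseased.
    """
    table = [[0, 0], [0, 0]]
    for exposed, diseased in pairs:
        table[0 if exposed else 1][0 if diseased else 1] += 1
    return table


def pearson_chi_square(table):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical observations on 200 persons.
observations = (
    [(True, True)] * 20 + [(True, False)] * 80
    + [(False, True)] * 10 + [(False, False)] * 90
)
table = two_by_two(observations)
chi_square = pearson_chi_square(table)
print(table, round(chi_square, 3))
```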
contingent variable See intermediate variable. 

continuing source epidemic (outbreak) An epidemic in which new cases of disease
occur over a long period, indicating persistence of the disease source. 
continuous DATA, CONTINUOUS variable Data (variable) with a potentially infinite 
number of possible values along a continuum. Data representing a continuous vari¬ 
able include height, weight, and enzyme output. 

CONTROL 

1. (v.) To regulate, restrain, correct, restore to normal. 

2. (n. or adj.) Applied to many communicable and some noncommunicable con¬ 
ditions, control means ongoing operations or programs aimed at reducing 
the incidence and/or prevalence, or eliminating such conditions. 

3. (n.) As used in the expressions case-control study and randomized control(led) 
trial, “control” means person(s) in a comparison group that differs, respectively,
in disease experience or allocation to a regimen, from the subjects of
the study. 

4. (v.) In statistics, “control” means to adjust for or take into account extraneous
influences or observations. 

5. (adj.) In the expression “control variable" we refer to an independent variable 
other than the hypothetical causal variable that has a potential effect on the 
dependent variable and is subject to control by analysis. 

The use of the noun “control” to describe the comparison groups in a case-control
study and in a randomized control(led) trial can confuse the uninitiated, e.g.,
ethical review committees; the essential ethical distinction is that there may be no
intervention in the lives or health status of the controls in a case-control study,
whereas controls in a randomized controlled trial may be asked to undergo a procedure
or regimen that may affect their health; their informed consent is therefore
essential. Consent may not be required (save to gain access to medical records) to
study controls in a case-control study. As M.W. Susser 1 has pointed out, the use of
the word “control” as verb, adjective, and noun may confuse even careful readers.
The verb is best used in the sense of controlling sources of extraneous variation in
the dependent variable, whether by design or analysis. The verb is also used in the
sense of controlling disease or its causes. The adjective is best used to describe
control variables in contradistinction to uncontrolled and confounding variables.
The adjective also can be used to describe a control group assembled for comparison
with a group of cases or with an experimental group. The noun is best used to
designate the members of a control group.

1 Causal Thinking in the Health Sciences. New York: Oxford, 1973.

controls, historical Persons or patients used for comparison who had the condition
or treatment under study at a different time, generally at an earlier period than
the study group or cases. Historical controls are often unsatisfactory because other
factors affecting the condition under study may have changed to an unknown extent
in the time elapsed.

controls, hospital Persons used for comparison who are drawn from the popula¬ 
tion of patients in a hospital. Hospital controls are often a source of selection 
bias. 

controls, matched Controls who are selected so that they are similar to the study 
group, or cases, in specific characteristics. Some commonly used matching variables
are age, sex, race, and socioeconomic status. See also matching. 

controls, neighborhood Persons used for comparison who live in the same locality 
as cases and therefore may resemble cases in environmental and socioeconomic 
criteria. 

controls, sibling Persons used for comparison who are the siblings of cases and 
therefore share genetic makeup. 

coordinates In a two-dimensional graph, the values of ordinate and abscissa that de¬ 
fine the locus or position of a point. 

cordon sanitaire The barrier erected around a focus of infection. Used mainly in the 
isolation procedures applied to exclude cases and contacts of life-threatening com¬ 
municable diseases from society. Mainly of historical interest. 

correlation The degree to which variables change together. 

correlation coefficient A measure of association that indicates the degree to which 
two variables have a linear relationship. This coefficient, represented by the letter 
r, can vary between +1 and -1; when r = +1, there is a perfect positive linear
relationship in which one variable varies directly with the other; when r = -1,
there is a perfect negative linear relationship between the variables. The measure
can be generalized to quantify the degree of linear relationship between one variable
and several others, in which case it is known as the multiple correlation coefficient.
Kendall's Tau, Spearman's Rank Correlation, and Pearson's Product Moment
Correlation tests are special varieties with occasional applications in
epidemiology. M.G. Kendall and W.R. Buckland's Dictionary of Statistical Terms 1 gives
details.

1 London: Longman, 1983.
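
The limiting values r = +1 and r = -1 can be checked with a short sketch of the product-moment coefficient. The function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient for paired values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # y rises exactly with x
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # y falls exactly as x rises
print(r_positive, r_negative)
```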

correlation, nonsense A meaningless correlation between two variables. Nonsense 
correlations sometimes occur when social, economic, or technological changes have 
the same trend over time as incidence or mortality rates. An example is correlation 
between the birth rate and the density of storks in parts of Holland and Germany. 
See also confounding; ecological fallacy. 

cost-benefit analysis An economic analysis in which the costs of medical care and 
the loss of net earnings due to death or disability are considered. The general rule 
for the allocation of funds in a cost-benefit analysis is that the ratio of marginal 
benefit (the benefit of preventing an additional case) to marginal cost (the cost of 
preventing an additional case) should be equal to or greater than 1.

cost-effectiveness analysis This form of analysis seeks to determine the costs and 
effectiveness of an activity, or to compare similar alternative activities to determine 
the relative degree to which they will obtain the desired objectives or outcomes. 
The preferred action or alternative is one that requires the least cost to produce a 
given level of effectiveness, or provides the greatest effectiveness for a given level
of cost. In the health care field, outcomes are measured in terms of health status. 

cost-utility analysis An economic analysis in which outcomes are measured in terms
of their social value.
covariate A variable that is possibly predictive of the outcome under study. A covariate
may be of direct interest to the study or may be a confounding variable or
effect modifier. 

coverage A measure of the extent to which the services rendered cover the potential 
need for these services in a community. It is expressed as a proportion in which 
the numerator is the number of services rendered, and the denominator is the 
number of instances in which the service should have been rendered. Example:

Annual obstetric coverage in a community =
(Number of deliveries attended by a qualified midwife or obstetrician)
/ (Expected number of deliveries during the year in a given community)

Cox model See proportional hazards model.

criterion A principle or standard by which something is judged. See also standard. 

Cronbach's alpha (Syn: internal consistency reliability) An estimate of the correlation
between the total score across a series of items from a rating scale and the total 
score that would have been obtained had a comparable series of items been em¬ 
ployed. 
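
The estimate is usually computed from item and total-score variances as alpha = k/(k - 1) x (1 - sum of item variances / variance of total score), where k is the number of items. This computing formula is supplied here as an assumption; it is not stated in the definition. A minimal sketch with hypothetical scores:

```python
import statistics


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score for each respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    item_variance_sum = sum(statistics.variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical rating scale: 3 items answered by 4 respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```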

cross-cultural study A study in which populations from different cultural backgrounds
are compared.

crossover design A method of comparing two or more treatments or interventions in 
which the subjects or patients, upon completion of the course of one treatment, are 
switched to another. In the case of two treatments, A and B, half the subjects are
randomly allocated to receive these in the order A, B and half to receive them in
the order B, A. A criticism of this design is that effects of the first treatment may
carry over into the period when the second is given. 

cross-product ratio See odds ratio. 

cross-sectional study (Syn: disease frequency survey, prevalence study) A study that
examines the relationship between diseases (or other health-related characteristics)
and other variables of interest as they exist in a defined population at one particular
time. The presence or absence of disease and the presence or absence of the other
variables (or, if they are quantitative, their level) are determined in each member
of the study population or in a representative sample at one particular time. The
relationship between a variable and the disease can be examined (1) in terms of the
prevalence of disease in different population subgroups defined according to the
presence or absence (or level) of the variables and (2) in terms of the presence or
absence (or level) of the variables in the diseased versus the nondiseased. Note that
disease prevalence rather than incidence is normally recorded in a cross-sectional
study. The temporal sequence of cause and effect cannot necessarily be determined
in a cross-sectional study. See also morbidity survey.

crude death rate See death rate. 

cumulative death rate The proportion of a group that dies over a specified time 
interval. It may refer to all deaths or to deaths from specific cause(s). If follow-up 
is not complete on all persons the proper estimation of this rate requires the use of 
methods that take account of censoring. Distinct from force of mortality.

cumulative incidence, cumulative incidence rate The number or proportion of a
group of people who experience the onset of a health-related event during a specified
time interval; this interval is generally the same for all members of the group,
but, as in lifetime incidence, it may vary from person to person without reference
to age.

cumulative incidence ratio The ratio of the cumulative incidence rate in the ex¬ 
posed to the cumulative incidence rate in the unexposed. 
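
A short sketch of the ratio, with hypothetical counts and function names, assuming complete follow-up of both groups over the same interval:

```python
def cumulative_incidence(new_cases, persons_at_risk):
    """Proportion of an initially at-risk group with onset during the interval."""
    if persons_at_risk <= 0:
        raise ValueError("persons_at_risk must be positive")
    return new_cases / persons_at_risk


def cumulative_incidence_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Cumulative incidence in the exposed divided by that in the unexposed."""
    unexposed = cumulative_incidence(cases_unexposed, n_unexposed)
    if unexposed == 0:
        raise ValueError("ratio is undefined when no unexposed person has the event")
    return cumulative_incidence(cases_exposed, n_exposed) / unexposed


# Hypothetical 5-year follow-up: 30 of 1000 exposed and 10 of 1000 unexposed
# persons experience the event.
ratio = cumulative_incidence_ratio(30, 1000, 10, 1000)
print(round(ratio, 2))
```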

CUSUM Acronym for cumulative sum (of a series of measurements). This is a useful way
to demonstrate a change in trend or direction of a series of measurements.1 Calculation
begins with a reference figure, e.g., the expected average measurement. As
each new measurement is observed, the reference figure is subtracted, and a cumulative
total is produced by adding each successive difference. This cumulative
total is the cusum.

1 Alderson M: An Introduction to Epidemiology, 2nd ed. London: Macmillan, 1983.
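
The calculation described above can be sketched as follows. The series and the reference figure are hypothetical.

```python
from itertools import accumulate


def cusum(measurements, reference):
    """Running total of (measurement - reference) for each successive measurement."""
    return list(accumulate(m - reference for m in measurements))


# Hypothetical weekly counts with an expected average of 10 per week.
series = [10, 12, 9, 11, 15, 16, 14]
totals = cusum(series, reference=10)
print(totals)  # [0, 2, 1, 2, 7, 13, 17]
```

While the series fluctuates around the reference figure the cusum stays near zero; the sustained climb from the fifth measurement onward marks the change in level.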

cyclicity, seasonal The annual cycling of incidence on a seasonal basis. Certain acute
infectious diseases, if of greater than rare occurrence, peak in one season of the
year and reach the low point six months later (or in the opposite season). The onset
of some symptoms of some chronic diseases also may show this amplitudinal cyclicity.
Demographic phenomena such as marriage and births, and mortality from
all causes and certain specific causes, may also exhibit seasonal cyclicity.

cyclicity, secular Long-term (greater than one year) cycling of disease incidence. For
example, measles in a large, unimmunized population has a high incidence every
second year; hepatitis A has a higher incidence every seventh year. Such cycling is
the result of continuous exhaustion and replacement of susceptibles in a relatively
stable population. Secular cyclicity may have large interval swings as in the recurrence
of pandemics of influenza.

cyst count See worm count.

data dredging A jargon term, meaning analyses done on a post hoc basis without
benefit of prestated hypotheses, as a means of identifying noteworthy differences.
Such analyses are sometimes done when data have been collected on a large num¬ 
ber of variables and hypotheses are suggested by the data; the scientific validity of 
data dredging is at best dubious, usually unacceptable. 
data processing Conversion (as by computer) of crude information into usable or 
storable form. Data generated by epidemiologic studies are usually transferred to 
punch cards or optical mark-sense forms and thence to a computer for storage and 
retrieval. The term is often loosely used to mean also the statistical analysis of data 
by a computer program. See also punch card.
death certificate A vital record signed by a licensed physician or, in some nations, 
by another designated health worker, that includes cause of death, decedent's name, 
sex, birthdate, and place of residence and of death. Occupation, birthplace, and
other information may be included. Immediate cause of death is recorded on the 
first line, followed by conditions giving rise to the immediate cause; the underlying 
cause is entered last. The underlying cause is coded and tabulated in official publications
of cause-specific mortality. Other significant conditions may also be recorded
separately, as is the mode of death, whether accidental or violent, etc. The
most important entries on a death certificate are underlying causes of death and
cause of death. These are defined in the Ninth (1975) Revision of the International
Classification of Diseases, as follows:

Causes of death: The causes of death to be entered on the medical certificate of
cause of death are all those diseases, morbid conditions, or injuries that either re¬ 
sulted in or contributed to death and the circumstances of the accident or violence 
which produced any such injuries. 

Underlying cause of death: The underlying cause of death is (1) the disease or injury
that initiated the train of events leading to death, or (2) the circumstances of the 
accident or violence that produced the fatal injury. 

Personal identifying information such as birthplace, parents' names (last name at 
birth), and birthdates are included on death certificates in some jurisdictions; this 
extra information makes possible a range of record linkage studies. 
death rate An estimate of the proportion of a population that dies during a specified
period. The numerator is the number of persons dying during the period; the
denominator is the size of the population, usually estimated as the mid-year
population. The death rate in a population is generally calculated by the formula

$$\text{Death rate} = \frac{\text{Number of deaths during a specified period}}{\text{Number of persons at risk of dying during the period}} \times 10^{n}$$

This rate is an estimate of the person-time death rate, i.e., the death rate per $10^{n}$
person-years. If the rate is low, it is also a good estimate of the cumulative death
rate. This rate is also called the crude death rate.
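
As an illustration only, the crude death rate calculation can be sketched in Python. The function name, parameter names, and figures below are hypothetical; the exponent n is the multiplier in the definition (n = 3 gives a rate per 1,000).

```python
def crude_death_rate(deaths, population_at_risk, n=3):
    """Deaths per 10**n persons during the period (n=3 gives a rate per 1,000)."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return deaths / population_at_risk * 10 ** n


# Hypothetical figures: 850 deaths in one year, mid-year population of 100,000.
rate = crude_death_rate(deaths=850, population_at_risk=100_000)
print(f"Crude death rate: {rate:.1f} per 1,000 per year")
```

The mid-year population stands in for the population at risk, as the definition suggests.
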
death registration area A geographic area for which mortality data are published. 
decision analysis A derivative of operations research and game theory that involves
identifying all available choices and potential outcomes of each, in a series of
decisions that have to be made about aspects of patient care: diagnostic procedures,
therapeutic regimens, prognostic expectations. Epidemiologic data play a large part
in determining the probabilities of outcomes following each choice that has to be
made. The range of choices can be plotted on a decision tree, and at each branch,
or decision node, the probabilities of each outcome that can be predicted are
displayed. The decision tree thus portrays the choices available to those responsible
for patient care and the probabilities of each outcome that will follow the choice of
a particular action or strategy in patient care. The relative worth of each outcome
is preferably also described as a utility or quality of life, e.g., a probability of life
expectancy or of freedom from disability. 1
1 Pauker SG, Kassirer JP: Decision analysis. N Engl J Med 316:250-258, 1987.
decision tree The alternative choices expressed in quantitative terms, available at each
stage in the process of thinking through a problem, may be likened to branches,
and the hierarchical sequence of options, to a tree. Hence, decision tree. It is a
graphic device used in decision analysis, in which a series of decision options are
represented as branches and subsequent possible outcomes are represented as
further branches. The decisions and the eventualities are presented in the order they
are likely to occur. The junction where a decision must be taken is called a decision
node.
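
A minimal sketch of how a decision tree might be evaluated, assuming each final outcome has been given a utility between 0 and 1. The tree layout, the labels "operate" and "medical", and all probabilities and utilities are invented for illustration; the "fold back" rule used here (average over chance branches, take the best option at decision nodes) is one common convention, not the only one.

```python
def expected_utility(node):
    """Fold back a decision tree.

    A node is either a number (utility of a final outcome),
    ("chance", [(probability, child), ...]) or
    ("decision", {option_label: child, ...}).
    """
    if isinstance(node, (int, float)):
        return node
    kind, branches = node
    if kind == "chance":
        # Probability-weighted average of the outcomes that may follow.
        return sum(p * expected_utility(child) for p, child in branches)
    if kind == "decision":
        # At a decision node, assume the option with the highest value is chosen.
        return max(expected_utility(child) for child in branches.values())
    raise ValueError(f"unknown node type: {kind}")


def best_choice(decision_node):
    """Label of the option with the highest expected utility."""
    _, branches = decision_node
    return max(branches, key=lambda label: expected_utility(branches[label]))


# Hypothetical tree: two options, each followed by a chance node.
tree = ("decision", {
    "operate": ("chance", [(0.9, 0.95), (0.1, 0.0)]),
    "medical": ("chance", [(0.7, 0.90), (0.3, 0.50)]),
})
print(best_choice(tree), round(expected_utility(tree), 3))
```
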

deduction Reasoned argument proceeding from the general to the particular. 


Source: https://www.industrydocuments.ucsf.edu/docs/kypxOOOO 

DEGREES OF FREEDOM (df) The number of independent comparisons that can be made
between the members of a sample. This important concept in statistical testing
cannot be defined briefly. It refers to the number of independent contributions to a
sampling distribution (such as the χ², t, and F distributions). In a contingency table it
is one less than the number of row categories multiplied by one less than the
number of column categories.
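
The contingency-table rule can be written as a short Python sketch; the function name and the example tables are hypothetical.

```python
def contingency_df(table):
    """Degrees of freedom of an r x c contingency table: (r - 1) * (c - 1)."""
    rows = len(table)
    cols = len(table[0])
    if any(len(row) != cols for row in table):
        raise ValueError("all rows must have the same number of columns")
    return (rows - 1) * (cols - 1)


# A fourfold (2 x 2) table has one degree of freedom.
print(contingency_df([[12, 8], [5, 15]]))
```
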

DEMAND (for health services) Willingness and/or ability to seek, use, and, in some
settings, to pay for services. Sometimes further subdivided into expressed demand
(equated with use) and potential demand, or NEED.

DEMOGRAPHIC transition The transition from high to low fertility and mortality rates,
usually related to technological change and industrialization.

DEMOGRAPHY The study of populations, especially with reference to size and density,
fertility, mortality, growth, age distribution, migration, and vital statistics, and
the interaction of all these with social and economic conditions.

demonstration MODEL An experimental health care facility, program, or system with
built-in provision for measuring aspects such as costs per unit of service, rates of
use by patients or clients, and outcomes of encounters between providers and users.
The aim usually is to determine the feasibility, efficacy, effectiveness, and/or
efficiency of the model service.

denominator The lower portion of a fraction used to calculate a rate or ratio. The 
population (or population experience, as in person-years, passenger-miles, etc.) at 
risk in the calculation of a rate or ratio. See also numerator. 

density of population Demographic term meaning numbers of persons in relation to
available space.

density sampling A method of selecting controls in a case control study in which
cases are sampled only from incident cases over a specific time period, and controls 
are sampled and interviewed throughout that period (rather than simply at one 
point in time, such as the end of the period). This method can reduce bias due to 
changing exposure patterns in the source population. 

dependency ratio Proportion of children and old people in a population in
comparison to all others, i.e., the proportion of economically inactive to economically active;
"children" are usually defined as ages under 15 and "old people" as ages 65 and
over.
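
A sketch of the calculation, using the usual age limits quoted above (under 15, and 65 and over). The function name and the population counts are hypothetical.

```python
def dependency_ratio(population_by_age, child_limit=15, old_age=65):
    """Ratio of persons aged < child_limit or >= old_age to all other persons.

    population_by_age maps an age in years to the number of persons of that age.
    """
    dependents = sum(n for age, n in population_by_age.items()
                     if age < child_limit or age >= old_age)
    others = sum(population_by_age.values()) - dependents
    if others == 0:
        raise ValueError("no persons in the economically active age range")
    return dependents / others


# Hypothetical population: 300 children, 600 adults of working age, 100 elderly.
print(round(dependency_ratio({10: 300, 30: 600, 70: 100}), 3))
```
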

DEPENDENT VARIABLE 

1. A variable the value of which is dependent on the effect of other variable(s)
(independent variable(s)) in the relationship under study. A manifestation or
outcome whose variation we seek to explain or account for by the influence of
independent variables.

2. In statistics, the dependent variable is the one predicted by a regression
equation.

See also independent variable. 

descriptive study A study concerned with and designed only to describe the existing 
distribution of variables, without regard to causal or other hypotheses. Contrast 
analytic study. An example is a community health survey, used to determine the 
health status of the people in a community. Descriptive studies, e.g., analyses of 
cancer registry data, can be used to measure risks. 

design See research design. 

DESIGN VARIABLE 

1. A study variable whose distribution in the subjects is determined by the
investigator.

2. In statistics, a variable taking on the value 1 to indicate membership in a
particular category and 0 or -1 to indicate nonmembership in the category. Used
primarily in analysis of variance.
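
The second sense can be illustrated with a small Python sketch that builds 0/1 design variables for a categorical factor. The function name, the blood-group example, and the choice of a reference category are assumptions made for illustration; the -1 coding mentioned above is not shown.

```python
def design_variables(values, reference):
    """Return (categories, rows): one 0/1 column per non-reference category."""
    categories = sorted(set(values) - {reference})
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


# Hypothetical factor: blood group, with group O as the reference category.
categories, rows = design_variables(["A", "B", "O", "A"], reference="O")
print(categories)
for row in rows:
    print(row)
```

A subject in the reference category scores 0 on every column.
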

determinant Any factor, whether event, characteristic, or other definable entity, that
brings about change in a health condition, or other defined characteristic. See also 
causality, factors in. 

diagnosis The process of determining health status and the factors responsible for
producing it; may be applied to an individual, family, group, or community. The 
term is applied both to the process of determination and to its findings. See also 
DISEASE LABEL. 

diagnostic index A system for recording diagnoses, diseases, or problems of patients
or clients in a medical practice or service, usually including identifying information
(name, birthdate, sex) and dates of encounters. See also e-book.

differential The difference(s) shown in tabulation of health and vital statistics ac¬ 
cording to age, sex, or some other factor; age differentials are the differences re¬ 
vealed in the tabulations of rates in age-groups, sex differentials are the differences 
in rates between males and females, income differentials are differences between 
designated income categories, etc. 

digit preference A preference for certain numbers that leads to rounding off
measurements. Rounding off may be to the nearest whole number, even number,
multiple of 5 or 10, or (when time units like a week are involved) 7, 14, etc. This can
be a form of observer variation, or an attribute of respondent(s) in a survey.
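
One simple way to look for digit preference is to tabulate the final digit of each recorded value and compare the counts with an even spread. The sketch below assumes whole-number readings; the function names and the blood pressure readings are hypothetical, and the chi-square statistic is offered only as a rough summary.

```python
from collections import Counter


def terminal_digit_counts(readings):
    """Count how often each final digit (0-9) occurs among whole-number readings."""
    counts = Counter(abs(int(r)) % 10 for r in readings)
    return [counts.get(digit, 0) for digit in range(10)]


def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts for every digit."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical systolic pressures (mm Hg); note the excess of zeros and fives.
readings = [120, 130, 140, 125, 130, 120, 150, 135, 140, 128]
counts = terminal_digit_counts(readings)
print(counts, chi_square_uniform(counts))
```
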

dimensionality The number of dimensions, i.e., scalar quantities, needed for accurate 
description of an element of a vector space. 

DIRECT ADJUSTMENT, DIRECT STANDARDIZATION See STANDARDIZATION. 

disability Temporary or long-term reduction of a person's capacity to function in
society. See also international classification of impairments, disabilities, and
handicaps for the official WHO definition.

discordant A term used in twin studies to describe a twin pair in which one twin 
exhibits a certain trait and the other does not. Also used in matched pair case 
control studies to describe a pair whose members had different exposures to the 
risk factor under study. Only the discordant pairs are informative about the
association between exposure and disease.
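
For matched pair case control data, the point can be sketched as follows. The sketch uses the standard matched-pair odds ratio (ratio of the two kinds of discordant pair) and McNemar's chi-square without continuity correction; these are named here as familiar techniques, not taken from this entry, and the counts are invented.

```python
def matched_pair_odds_ratio(case_exposed_only, control_exposed_only):
    """Odds ratio from discordant pairs: pairs where only the case was exposed,
    divided by pairs where only the control was exposed."""
    if control_exposed_only == 0:
        raise ValueError("odds ratio undefined when no control-only pairs exist")
    return case_exposed_only / control_exposed_only


def mcnemar_chi_square(case_exposed_only, control_exposed_only):
    """McNemar chi-square (1 df, no continuity correction)."""
    b, c = case_exposed_only, control_exposed_only
    return (b - c) ** 2 / (b + c)


# Hypothetical study: 30 pairs with only the case exposed, 10 with only the control.
print(matched_pair_odds_ratio(30, 10), mcnemar_chi_square(30, 10))
```

Concordant pairs (both exposed or both unexposed) do not enter either calculation.
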

discrete data Data that can be arranged into naturally occurring or arbitrarily se¬ 
lected groups or sets of values, as opposed to data in which there are no naturally 
occurring breaks in continuity, i.e., continuous data. An example is number of 
decayed, missing, and filled teeth (DMF). 

discriminant analysis A statistical analytic technique used with discrete dependent
variables, concerned with separating sets of observed values and allocating new
values; can sometimes be used instead of regression analysis. Kendall and Buckland 1
refer to this as "discriminatory analysis" and describe it as a rule for allocating
individuals or values from two or more discrete populations to the correct
population with minimal probability of misclassification.

1 Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.

Disease Literally, dis-ease, the opposite of ease, when something is wrong with a bodily
function. The words “disease,” “illness,” and “sickness” are loosely interchangeable, 
but are better regarded as not wholly synonymous. M. W. Susser has suggested that 
they be used as follows: 

Disease is a physiological/psychological dysfunction.

Illness is a subjective state of the person who feels aware of not being well;


Sickness is a state of social dysfunction, i.e., a role that the individual assumes
when ill.

DISEASE FREQUENCY SURVEY See CROSS-SECTIONAL STUDY; MORBIDITY SURVEY.

disease label The identity of the condition from which a patient suffers. It may be
the name of a precisely defined disorder identified by a battery of tests, a
probability statement based on consideration of what is most likely among several
possibilities, or an opinion based on pattern recognition. Use of the word "label" can convey
stigma, so this term should be used with care, if at all. See also diagnosis.

DISEASE ODDS RATIO See ODDS RATIO.

disease, preclinical Disease with no signs or symptoms, because they have not yet
developed. See also inapparent infection.

DISEASE REGISTRY See REGISTER, REGISTRY.

disease, subclinical A condition in which disease is detectable by special tests but does
not reveal itself by signs or symptoms.

DISEASE TAXONOMY See TAXONOMY OF DISEASE. 

Disinfection Killing of infectious agents outside the body by direct exposure to chem¬ 
ical or physical agents. 

Concurrent disinfection is the application of disinfective measures as soon as pos¬ 
sible after the discharge of infectious material from the body of an infected person, 
or after the soiling of articles with such infectious discharges, all personal contact 
with such discharges or articles being minimized prior to such disinfection. 

Terminal disinfection is the application of disinfective measures after the patient
has been removed by death or to a hospital, or has ceased to be a source of
infection, or after other hospital isolation practices have been discontinued. Terminal
disinfection is rarely practiced; terminal cleaning generally suffices, along with
airing and sunning of rooms, furniture, and bedding. Disinfection is necessary only
for diseases spread by indirect contact; steam sterilization or incineration of
bedding and other items is desirable after a disease such as plague or anthrax. 1

1 Benenson AS (Ed): Control of Communicable Diseases in Man, 14th ed. Washington, DC: American
Public Health Association, 1985.

disinfestation Any physical or chemical process serving to destroy or remove
undesired small animal forms, particularly arthropods or rodents, present upon the
person, the clothing, or in the environment of an individual, or on domestic animals.
Disinfestation includes delousing for infestation with Pediculus humanus humanus,
the body louse. Synonyms include the terms "disinsection" and "disinsectization"
when insects only are involved.

distribution The complete summary of the frequencies of the values or categories of 
a measurement made on a group of persons. The distribution tells either how many 
or what proportion of the group was found to have each value (or each range of 
values) out of all the possible values that the quantitative measure can have. 

distribution-free method A method which does not depend upon the form of the 
underlying distribution. 

distribution function A function that gives the relative frequency with which a
random variable falls at or below each of a series of values. Examples include the
normal distribution, log-normal distribution, chi-square distribution, t distribution,
F distribution, and binomial distribution, all of which have applications in
epidemiology.
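
For observed data, the same idea gives the empirical distribution function: the proportion of observations at or below each value. A minimal sketch, with a hypothetical function name and sample:

```python
from bisect import bisect_right


def empirical_distribution_function(sample):
    """Return F, where F(x) is the proportion of the sample at or below x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def F(x):
        # bisect_right counts how many ordered values are <= x.
        return bisect_right(ordered, x) / n

    return F


# Hypothetical sample of five measurements.
F = empirical_distribution_function([1, 2, 2, 3, 5])
print(F(0), F(2), F(4), F(5))
```
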

DMF The abbreviation DMF stands for decayed, missing, and filled teeth. Lowercase
letters, i.e., dmf, are used for deciduous dentition, upper case for permanent teeth.
The DMF number is widely used in dental epidemiology.

dose-response relationship A relationship in which a change in amount, intensity,
or duration of exposure is associated with a change, either an increase or a
decrease, in risk of a specified outcome.

double-blind trial A procedure of blind assignment to study and control groups and
blind assessment of outcome, designed to ensure that ascertainment of outcome is
not biased by knowledge of the group to which an individual was assigned.
"Double" refers to both parties, i.e., the observer(s) in contact with the subjects, and the
subjects in the study and control groups. See also blind experiment; randomized
controlled trial.

drift See genetic drift; social drift. 

droplet nuclei A type of particle implicated in the spread of airborne infection. Droplet
nuclei are tiny particles (1-10 μm diameter) that represent the dried residue of
droplets. They may be formed by (1) evaporation of droplets coughed or sneezed
into the air or (2) aerosolization of infective materials. See also transmission of
infection.

DROPOUT A person enrolled in a study who becomes inaccessible or ineligible for fol¬ 
low-up, e.g., because of inability or unwillingness to remain enrolled in the study. 
The occurrence of dropouts can lead to biases in study results. 

DUMMY VARIABLE See INDICATOR VARIABLE. 

dynamic population A population that gains and loses members; all natural
populations are dynamic, a fact recognized by the term "population dynamics," used by
demographers to denote changing composition. See also population dynamics;
stable population.


EARLY warning system In disease surveillance, a specific procedure to detect as early
as possible any departure from usual or normally observed frequency of
phenomena. For example, the routine monitoring of numbers of deaths from pneumonia
and influenza in large American cities is an early warning system for the
identification of influenza epidemics. In developing countries, a change in children's
average weights is an early warning signal of nutritional deficiency.
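
A minimal sketch of one possible alert rule, assuming that "usual frequency" is summarized by the mean and standard deviation of recent baseline counts. The rule (mean plus two standard deviations), the function name, and the counts are illustrative assumptions; surveillance systems in practice use a variety of methods.

```python
from statistics import mean, stdev


def exceeds_threshold(baseline_counts, current_count, z=2.0):
    """Return (flag, threshold), where threshold = baseline mean + z * SD."""
    threshold = mean(baseline_counts) + z * stdev(baseline_counts)
    return current_count > threshold, threshold


# Hypothetical weekly deaths from pneumonia and influenza in one city.
baseline = [40, 42, 38, 41, 39]
flag, threshold = exceeds_threshold(baseline, current_count=50)
print(flag, round(threshold, 1))
```
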

E-book Method (developed by Eimerl) 1 of recording encounters in primary medical
care: encounters are arranged by problem or diagnostic category, thus making it
easy to count the number of persons seen (and the number of times each is seen)
according to problem or diagnostic category in a given period of time. Widely used
in epidemiologic studies of primary medical care. See also age-sex register;
diagnostic index.

1 Eimerl TS: Organized curiosity. J Coll Gen Practit 5:246-252, 1960.

ecological analysis Analysis based on aggregated or grouped data; errors in
inference may result because associations may be artifactually created or masked by the
aggregation process.

ecological correlation A correlation in which the units studied are populations rather
than individuals. Correlations found in this manner may not hold true for the
individual members of these populations. See also ecological fallacy.

ecological fallacy (Syn: aggregation bias, ecological bias)

1. The bias that may occur because an association observed between variables on 
an aggregate level does not necessarily represent the association that exists at 
an individual level. 

2. An error in inference due to failure to distinguish between different levels of 
organization. A correlation between variables based on group (ecological) 
characteristics is not necessarily reproduced between variables based on indi¬ 
vidual characteristics; an association at one level may disappear at another, or 
even be reversed. Example: At the ecological level, a correlation has been found 
in several studies between the quality of drinking water and mortality rates 
from heart disease; it would be an ecological fallacy to infer from this alone 
that exposure to water of a particular level of hardness necessarily influences 
the individual’s chances of getting or dying of heart disease. 
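
The reversal described in sense 2 can be reproduced with constructed data. In the sketch below, which uses invented values for three hypothetical groups, the correlation between group means is +1 while the correlation among individuals within every group is -1.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Invented data: (exposure values, outcome values) for individuals in each group.
groups = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Ecological level: one point per group (the group means).
mean_x = [sum(x) / len(x) for x, _ in groups.values()]
mean_y = [sum(y) / len(y) for _, y in groups.values()]
ecological_r = pearson(mean_x, mean_y)

# Individual level: correlation within each group.
within_r = {name: pearson(x, y) for name, (x, y) in groups.items()}
print(ecological_r, within_r)
```
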

ecological study A study in which the units of analysis are populations or groups of 
people, rather than individuals. An example is the study of association between 
median income and cancer mortality rates in administrative jurisdictions such as 
states and counties. 

ECOLOGY The study of the relationships among living organisms and their
environment. "Human ecology" means the study of human groups as influenced by
environmental factors, often including social and behavioral factors.

ECOSYSTEM The plant and animal life of a region considered in relation to the
environmental factors that influence it; more specifically, the fundamental unit in ecology,
comprising the living organisms and the nonliving elements that interact in a
defined region.

effect The result of a cause. In epidemiology, frequently a synonym for effect mea¬ 
sure. 

effectiveness The extent to which a specific intervention, procedure, regimen, or
service, when deployed in the field, does what it is intended to do for a defined
population.

effect measure A quantity that measures the effect of a factor on the frequency or
risk of a health outcome. Three such measures are attributable fractions, which
measure the fraction of cases due to a factor; risk and rate differences, which
measure the amount a factor adds to the risk or rate of a disease; and risk and rate
ratios, which measure the amount by which a factor multiplies the risk or rate of
disease.
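
The three measures can be computed from cohort-type counts as in the following sketch. The function name and the counts are hypothetical, and the attributable fraction shown is the one among the exposed.

```python
def effect_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk difference, risk ratio, and attributable fraction among the exposed."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "attributable_fraction_exposed":
            (risk_exposed - risk_unexposed) / risk_exposed,
    }


# Hypothetical cohort: 30 cases among 100 exposed, 10 cases among 100 unexposed.
for name, value in effect_measures(30, 100, 10, 100).items():
    print(f"{name}: {value:.3f}")
```
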

effect modifier (Syn: conditional variable, moderator variable) A factor that modifies 
the effect of a putative causal factor under study. For example, age is an effect 
modifier for many conditions, and immunization status is an effect modifier for the 
consequences of exposure to pathogenic organisms. Effect modification is detected 
by varying the selected effect measure for the factor under study across levels of
another factor. See also causality, factors in; interaction. 
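
As a sketch of how such variation might be examined, the risk ratio can be computed separately within each level of the second factor. The function name, the age strata, and the counts are invented; a formal assessment would also consider sampling error.

```python
def stratum_risk_ratios(strata):
    """Risk ratio within each stratum.

    strata maps a stratum label to
    (cases_exposed, n_exposed, cases_unexposed, n_unexposed).
    """
    return {
        label: (a / n1) / (c / n0)
        for label, (a, n1, c, n0) in strata.items()
    }


# Hypothetical data stratified by age group.
ratios = stratum_risk_ratios({
    "under 50": (10, 100, 5, 100),
    "50 and over": (40, 100, 5, 100),
})
print(ratios)
```

Markedly different stratum-specific ratios, as in this invented example, would suggest that age modifies the effect on the ratio scale.
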

effective sample size Sample size after dropouts, deaths, and other specified exclu¬ 
sions from an original sample. 

efficacy The extent to which a specific intervention, procedure, regimen, or service
produces a beneficial result under ideal conditions. Ideally, the determination of
efficacy is based on the results of a randomized controlled trial.
efficiency 

1. The effects or end-results achieved in relation to the effort expended in terms
of money, resources, and time. The extent to which the resources used to
provide a specific intervention, procedure, regimen, or service of known
efficacy and effectiveness are minimized. A measure of the economy (or cost in
resources) with which a procedure of known efficacy and effectiveness is
carried out.

2. In statistics, the relative precision with which a particular study design or
estimator will estimate a parameter of interest.

egg count See worm count. 

elimination See eradication (of disease).

empirical Based directly on experience, e.g., observation or experiment, rather than
on reasoning alone. 

encounter A face-to-face transaction between a personal health worker and a patient 
or client. 

endemic disease The constant presence of a disease or infectious agent within a given
geographic area or population group; may also refer to the usual prevalence of a
given disease within such area or group. See also holoendemic disease;
hyperendemic disease.

end results See OUTCOMES. 

environment All that which is external to the individual human host. Can be divided 
into physical, biological, social, cultural, etc., any or all of which can influence health 
status of populations. 

epidemic [from the Greek epi (upon), demos (people)] The occurrence in a community
or region of cases of an illness, specific health-related behavior, or other
health-related events clearly in excess of normal expectancy. The community or region,
and the period in which the cases occur, are specified precisely. The number of
cases indicating the presence of an epidemic varies according to the agent, size, and
type of population exposed, previous experience or lack of exposure to the disease,
and time and place of occurrence; epidemicity is thus relative to usual frequency of
the disease in the same area, among the specified population, at the same season of
the year. A single case of a communicable disease long absent from a population or
first invasion by a disease not previously recognized in that area requires immediate
reporting and full field investigation; two cases of such a disease associated in time
and place may be sufficient evidence to be considered an epidemic.

The word may be used also to describe outbreaks of disease in animal or plant 
populations. See also epizootic; epornithic. 
epidemic, common source (Syn: common vehicle epidemic, holomiantic disease)
Outbreak due to exposure of a group of persons to a noxious influence that is common
to the individuals in the group. When the exposure is brief and essentially
simultaneous, the resultant cases all develop within one incubation period of the disease
(a "point" or "point source" epidemic).

The term "holomiantic disease" was used by Stallybrass (1931) to describe
outbreaks of this type, but as with several other terms created from Greek or Latin
roots, transmission to epidemiologists who lacked a classical education did not take
place.

epidemic curve A graphic plotting of the distribution of cases by time of onset. 
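
A minimal text version of an epidemic curve can be produced by counting cases for each day of onset, including days with no cases. The function name and the onset dates are hypothetical, and daily intervals are an assumption; other time units may suit other diseases.

```python
from collections import Counter
from datetime import date, timedelta


def epidemic_curve(onset_dates):
    """Return [(day, number_of_cases), ...] for every day from first to last onset."""
    counts = Counter(onset_dates)
    day, last = min(counts), max(counts)
    curve = []
    while day <= last:
        curve.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return curve


# Hypothetical outbreak: dates of onset for four cases.
onsets = [date(2024, 3, 1), date(2024, 3, 2), date(2024, 3, 2), date(2024, 3, 4)]
for day, cases in epidemic_curve(onsets):
    print(day.isoformat(), "*" * cases)
```
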
epidemic, mathematical model of See mathematical model.
epidemic, point source See epidemic, common source. 

epidemiologist An investigator who studies the occurrence of disease or other
health-related conditions or events in defined populations. The control of disease in
populations is often also considered to be a task for the epidemiologist, especially in
speaking of certain specialized fields such as malaria epidemiology. Epidemiologists
may study disease in populations of animals and plants, as well as among human
populations. See also clinical epidemiologist.
epidemiology The study of the distribution and determinants of health-related states
or events in specified populations, and the application of this study to control of
health problems.

There have been many definitions of epidemiology. In the past 50 years or so, 
the definition has broadened from concern with communicable disease epidemics 
to take in all phenomena related to health in populations. 

The Oxford English Dictionary (OED) gives as a definition: "That branch of medical
science which treats of epidemics" and cites Parkin (1873) as a source. However,
there was a "London Epidemiological Society" in the 1850s. The identity of the
scholar who first used the word at that time has been lost. Epidemiología appears in
the title of a Spanish history of epidemics, Epidemiología española, Madrid, 1802.

Epidemic is much older. The word appears in Johnson's Dictionary (1775), and
OED gives a citation dated 1603. The word was, of course, used by Hippocrates.

EPIDEMIOLOGY, ANALYTIC See ANALYTIC STUDY. 

epidemiology, descriptive Study of the occurrence of disease or other health-related
characteristics in human populations. General observations concerning the
relationship of disease to basic characteristics such as age, sex, race, occupation, and social
class; also concerned with geographic location. The major characteristics in
descriptive epidemiology can be classified under the headings: persons, place, and time.
See also observational study.

EPIDEMIOLOGY, EXPERIMENTAL See EXPERIMENTAL EPIDEMIOLOGY. 
episode Period in which a health problem or illness exists, from its onset to its
resolution. See also encounter.

epizootic An outbreak (epidemic) of disease in an animal population (often with the 
implication that it may also affect human populations). 
epornithic An outbreak (epidemic) of disease in a bird population. 
eradication (of disease) Termination of all transmission of infection by
extermination of the infectious agent through surveillance and containment. Eradication, as
in the instance of smallpox, was based on the joint activities of control and
surveillance. Regional eradication has been successful with malaria and in some countries
appears close to succeeding for measles. The term "elimination" is sometimes used
to describe eradication of diseases such as measles from a large geographic region
or political jurisdiction.

ERROR 

1. A false or mistaken result obtained in a study or experiment. Several kinds of 
error can occur in epidemiology, for example, due to bias. 

2. Random error is the portion of variation in a measurement that has no ap¬ 
parent connection to any other measurement or variable, generally regarded 
as due to chance. 

3. Systematic error often has a recognizable source, e.g., a faulty measuring
instrument, or pattern, e.g., it is consistently wrong in a particular
direction. See also bias.

error, type I (Syn: alpha error) The error of rejecting a true null hypothesis. See also
SIGNIFICANCE LEVEL; STATISTICAL TEST. 

error, type II (Syn: beta error) The error of failing to reject a false null hypothesis. 
See also power; statistical test. 
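
Both error probabilities can be worked out exactly for a simple test. The sketch below assumes a hypothetical one-sided binomial test: 20 subjects, null hypothesis p = 0.5, and rejection when 15 or more successes are seen. The decision rule and the alternative value p = 0.8 are invented for illustration.

```python
from math import comb


def binomial_upper_tail(n, p, k):
    """P(X >= k) for X distributed as Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n, critical_value = 20, 15

# Type I error: probability of rejecting when the null hypothesis (p = 0.5) is true.
type_i = binomial_upper_tail(n, 0.5, critical_value)

# Type II error: probability of failing to reject when the true value is p = 0.8.
type_ii = 1 - binomial_upper_tail(n, 0.8, critical_value)

print(f"alpha = {type_i:.4f}, beta = {type_ii:.4f}, power = {1 - type_ii:.4f}")
```
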

estimate A measurement or a statement about the value of some quantity is said to be
an estimate if it is known, believed, or suspected to incorporate some degree of
error.

estimator In statistics, a function for computing estimates of a parameter from ob¬ 
served data. 

ethics The branch of philosophy that deals with the distinction between right and
wrong, with the moral consequences of human actions. Ethical principles govern
the conduct of epidemiology, as they do all human activities; the ethical issues that
are specific to epidemiological practice and research include informed consent,
confidentiality, and respect for human rights. The issues have been defined, described,
and discussed by many writers and by special committees under the auspices of
research granting agencies and other official bodies in many countries. 1
1 See, for example, the following: Curran WJ: Protecting confidentiality in epidemiologic
investigations by the Centers for Disease Control. N Engl J Med 314:1027-1028, 1986.

Susser MW, Stein Z, Kline J: Ethics in epidemiology. Ann Am Acad Pol Soc Sci 437:128-141, 1978.
Commonwealth of Australia. National Health and Medical Research Council. Medical Research
Ethics Committee: Report on Ethics in Epidemiological Research. Canberra, 1985.

Stolley PD: Faith, evidence and the epidemiologist. J Public Health Pol 6:37-42, 1985.

Gordis L, Gold E, Seltser R: Privacy and protection in epidemiologic and medical research:
Challenge and responsibility. Am J Epidemiol 105:163-168, 1977.

National Academy of Sciences. Institute of Medicine: Ethics of Health Care. Washington, DC, 1974.
Tancredi LR (ed): Ethical issues in epidemiologic research (Vol VII, series in Psychosocial
Epidemiology). New Brunswick, NJ: Rutgers University Press, 1986.

ETHNIC group A social group characterised by a distinctive social and cultural tradition,
maintained within the group from generation to generation, a common history and 
origin, and a sense of identification with the group. Members of the group have 
distinctive features in their way of life, shared experiences, and often a common 
genetic heritage. These features may be reflected in their health and disease expe¬ 
rience. See also race. 

etiology Literally, the science of causes, causality; in common usage, cause. See also 
causality; pathogenesis. 

ETIOLOGIC FRACTION (EXPOSED) See ATTRIBUTABLE FRACTION (EXPOSED).

ETIOLOGIC FRACTION (POPULATION) See ATTRIBUTABLE FRACTION (POPULATION). 

evaluation A process that attempts to determine as systematically and objectively as
possible the relevance, effectiveness, and impact of activities in the light of their
objectives. Several varieties of evaluation can be distinguished, e.g., evaluation of
structure, process, and outcome. See also clinical trial; effectiveness; efficacy;
efficiency; health services research; program evaluation and review
techniques; quality of care.

Evans's postulates Expanding biomedical knowledge has led to revision of Henle's
and Koch's postulates. Alfred Evans 1 developed those that follow, based on the
Henle-Koch model.

1. Prevalence of the disease should be significantly higher in those exposed to
the hypothesized cause than in controls not so exposed.

2. Exposure to the hypothesized cause should be more frequent among those 
with the disease than in controls without the disease—when all other risk 
factors are held constant. 

3. Incidence of the disease should be significantly higher in those exposed to 
the hypothesized cause than in those not so exposed, as shown by prospective 
studies. 

4. The disease should follow exposure to the hypothesized causative agent with
a distribution of incubation periods on a bell-shaped curve.

5. A spectrum of host responses should follow exposure to the hypothesized 
agent along a logical biological gradient from mild to severe. 

6. A measurable host response following exposure to the hypothesized cause
should have a high probability of appearing in those lacking this before
exposure (e.g., antibody, cancer cells), or should increase in magnitude if
present before exposure. This response pattern should occur infrequently in
persons not so exposed.

7. Experimental reproduction of the disease should occur more frequently in
animals or man appropriately exposed to the hypothesized cause than in those
not so exposed; this exposure may be deliberate in volunteers,
experimentally induced in the laboratory, or may represent a regulation of natural
exposure.

8. Elimination or modification of the hypothesized cause should decrease the
incidence of the disease (i.e., attenuation of a virus, removal of tar from
cigarettes).

9. Prevention or modification of the host's response on exposure to the
hypothesized cause should decrease or eliminate the disease (i.e., immunization, drugs
to lower cholesterol, specific lymphocyte transfer factor in cancer).

10. All of the relationships and findings should make biological and
epidemiologic sense.

1 Evans AS: Causation and disease: The Henle-Koch postulates revisited. Yale J Biol Med 49:175-195,
1976.
exact method A statistical method based on the actual, i.e., "exact," probability
distribution of the study data, rather than on an approximation such as the normal or
chi-square distribution; for example, Fisher’s exact test.

exact test A statistical test based on the actual null probability distribution of the
study data, rather than, say, a normal approximation. The most common exact test
is the Fisher-Irwin test for fourfold tables.
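
As an illustration of what "exact" means for a fourfold table, the following is a minimal sketch that computes a two-sided p-value directly from the hypergeometric distribution with the table margins held fixed. The function names and the rule for the two-sided tail (summing all tables no more probable than the observed one) are assumptions chosen for this example; other two-sided conventions exist.

```python
from math import comb


def hypergeom_pmf(a, row1, col1, n):
    """Probability of count `a` in the top-left cell, given fixed margins."""
    return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p-value for the fourfold table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p_observed = hypergeom_pmf(a, row1, col1, n)
    lowest = max(0, row1 + col1 - n)   # smallest feasible top-left count
    highest = min(row1, col1)          # largest feasible top-left count
    p_value = 0.0
    for k in range(lowest, highest + 1):
        p_k = hypergeom_pmf(k, row1, col1, n)
        # Count every table that is no more probable than the observed one
        # (small tolerance guards against floating-point ties).
        if p_k <= p_observed * (1 + 1e-9):
            p_value += p_k
    return p_value
```

No normal or chi-square approximation enters the calculation, which is why the method remains usable when cell counts are small.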

EXCESS RATE AMONG EXPOSED See RATE DIFFERENCE. 

excess risk A term sometimes used to refer to the population excess rate and some¬ 
times to RISK DIFFERENCE. 

EXPANDED PROGRAMME ON IMMUNIZATION Part of the effort to achieve "Health for All
by the Year 2000," under the auspices of WHO, UNICEF, and other international 
and bilateral aid agencies. This is a program of immunizing against diphtheria, 
tetanus, measles, pertussis, poliomyelitis, and tuberculosis, conducted especially in 
developing countries. 

expectation of life (Syn: life expectancy or expectation) The average number of
years an individual of a given age is expected to live if current mortality rates
continue to apply. A statistical abstraction based on existing, age-specific death rates.

Life expectancy at birth (e₀): Average number of years a newborn baby can be
expected to live if current mortality trends continue. Corresponds to the total number
of years a given birth cohort can be expected to live, divided by the number of
children in the cohort. Life expectancy at birth is partly dependent on mortality in
the first year of life and is lower in poor than in rich countries because of the higher
infant and child mortality rates in the former.

Life expectancy at a given age, age x (eₓ): The average number of additional years a
person age x would live if current mortality trends continue to apply, based on the
age-specific death rates for a given year.

Life expectancy is a hypothetical measure and indicator of current health and 
mortality conditions. It is not a rate. 
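
The calculation behind this entry can be sketched as follows. This is a simplified illustration, not an actuarial method: it assumes single-year age intervals, takes as input the probability of dying within each year of age (a hypothetical list `qx`, which in practice would be derived from age-specific death rates), assumes deaths fall on average at mid-interval, and closes the table by setting the last probability to 1.

```python
def life_expectancy(qx, start_age=0):
    """Expected additional years of life for a person alive at `start_age`.

    qx[i] is the assumed probability of dying between exact ages i and i + 1;
    the final entry should be 1.0 so that the hypothetical cohort dies out.
    """
    survivors = 1.0      # proportion of the cohort still alive
    person_years = 0.0   # years lived by the cohort from start_age onward
    for q in qx[start_age:]:
        deaths = survivors * q
        # Survivors of the interval contribute a full year; those who die
        # are assumed to contribute half a year on average.
        person_years += survivors - 0.5 * deaths
        survivors -= deaths
    return person_years
```

Calling the function with `start_age=0` gives life expectancy at birth; any other `start_age` gives the average number of additional years at that age under the same fixed set of probabilities.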

experiment A study in which the investigator intentionally alters one or more factors
under controlled conditions in order to study the effects of so doing.

experimental epidemiology In modern usage, this term is often equated with
randomized controlled trials. To Greenwood and other epidemiologists in the 1920s,
it meant the study of epidemics among colonies of experimental animals such as
rats and mice. The original meaning of the term is preferable; if the word
"experiment" is qualified by the adjective "epidemiologic" it is a synonym for
randomized controlled trial. See also animal model.

experimental study A study in which conditions are under the direct control of the
investigator. In epidemiology, a study in which a population is selected for a planned
trial of a regimen whose effects are measured by comparing the outcome of the
regimen in the experimental group with the outcome of another regimen in a
control group. To avoid bias, members of the experimental and control groups should
be comparable except in the regimen that is offered them. Allocation of individuals
to experimental or control groups is ideally by randomization. In a randomized
controlled trial, individuals are randomly allocated; in some experiments, e.g.,
fluoridation of drinking water, whole communities have been (nonrandomly)
allocated to experimental and control groups.

EXPLANATORY STUDY A study whose main objective is to explain, rather than merely
describe, a situation, by isolating the effects of specific variables and understanding
the mechanisms of action. See also pragmatic study.

EXPLANATORY VARIABLE

1. A variable that causally explains the association or outcome under study.

2. In statistics, a synonym for independent variable.

exposed In epidemiology, the exposed group (or simply, the exposed) is often used to 
connote a group whose members have been exposed to a supposed cause of a dis¬ 
ease or health state of interest, or possess a characteristic that is a determinant of 
the health outcome of interest. 

EXPOSURE 

1. Proximity and/or contact with a source of a disease agent in such a manner 
that effective transmission of the agent or harmful effects of the agent may 
occur. 

2. The amount of a factor to which a group or individual was exposed; some¬ 
times contrasted with dose, the amount that enters or interacts with the orga¬ 
nism. 

3. Exposures may of course be beneficial rather than harmful, e.g., exposure to
immunizing agents.

EXPOSURE-ODDS RATIO See ODDS RATIO. 

exposure ratio The ratio of rates at which persons in the case and control groups of
a case-control study are exposed to the risk factor (or to the protective factor)
of interest.

expressivity In genetics, the extent to which a gene is expressed.

extrapolate, extrapolation To predict the value of a variate outside the range of 
observations; the resulting prediction. See also interpolate. 

extrinsic incubation period Time required for development of a disease agent in a
vector from the time of uptake of the agent to the time when the vector is infective. 
See also incubation period; vector-borne infection. 
F distribution (Syn: variance ratio distribution) The distribution of the ratio of two
independent quantities each of which is distributed like a variance in normally
distributed samples. So named in honor of R.A. Fisher, who first described this
distribution.

F₁ ("F one") Term used in genetics to describe first-generation progeny of a mating.

factor (Syn: determinant)

1. An event, characteristic, or other definable entity that brings about a change
in a health condition or other defined outcome. See also causality;
causation of disease, factors in.

2. A synonym for (categorical) independent variable, or more precisely, an
independent variable used to identify, with numerical codes, membership of
qualitatively different groups. A causal role may be implied, as in "overcrowding
is a factor in disease transmission," where overcrowding represents the
highest level of the factor "crowding."

factor analysis A set of statistical methods for analyzing the correlations among
several variables in order to estimate the number of fundamental dimensions that
underlie the observed data and to describe and measure those dimensions. Used
frequently in the development of scoring systems for rating scales and questionnaires.

factorial design A method of setting up an experiment or study to assure that all
levels of each intervention or classificatory factor occur with all levels of the others.

false negative Negative test result in a subject who possesses the attribute for which
the test is conducted. The labeling of a diseased person as healthy when screening
in the detection of disease. See also screening; sensitivity and specificity.

false positive Positive test result in a subject who does not possess the attribute for
which the test is conducted. The labeling of a healthy person as diseased when
screening in the detection of disease. See also screening; sensitivity and specificity.

familial disease Disease that exhibits a tendency to familial occurrence. Familial
occurrence of disease may be due to genetic transmission, intrafamilial transmission
of infection or culture, interaction within the family, or the family’s shared
experience, including its exposure to a common environment.

family A group of two or more persons united by blood, adoptive or marital ties, or
the common law equivalent; the family may include members who do not share the
household but are united to other members by blood, adoptive or marital, or
equivalent ties. Epidemiologic studies may be concerned with family members or with
those who share the same household or dwelling unit.

family, extended A group of persons comprising members of several generations united
by blood, adoptive and marital, or equivalent ties. See also family, nuclear.

family contact disease Disease that occurs among members of the family of a worker
who is exposed to a toxic substance and carries this home on his person or his
clothing, causing exposure to other family members.

family, nuclear A group of persons comprising members of a single or at most two
generations, usually husband-wife-children, united by blood or adoptive and
marital or equivalent ties.

family of classifications In nosology, a set of related classification systems
describing different aspects of health problems. For example, the International
Classification of Disease, the International Classification of Health Problems in Primary Care,
the International Classification of Impairments, Disabilities and Handicaps, and the
specialty subclassifications for oncology, psychiatry, etc., developed by WHO
working groups constitute a "family of classifications."

FAMILY STUDY An epidemiologic study of a family or a group of families. The term has
been used to describe surveillance of family groups, e.g., for tuberculosis. In
genetics, investigation of families showing an unusual characteristic in order to
determine whether the characteristic clusters in certain families and if so, why.

Farr, William (1807-1883) A medical graduate who became the first compiler of
abstracts (statistician) to the Registrar-General in the newly established General
Register Office of England in 1839 and remained there for more than 40 years. In his
Annual Reports, the combination of facts on death rates and vivid language drew
attention to many inequalities of health and sickness experience between "healthy"
and "unhealthy" districts in England. His many contributions to vital statistics and
epidemiology are contained in his monograph Vital Statistics (London, 1885). These
include a statement of the relationship between incidence and prevalence, the
concepts of person-years, retrospective and prospective approaches, observed and
expected numbers of events, the first workable nosology, and empirical laws about
the natural history of epidemics.

fatality rate The death rate observed in a designated series of persons affected by a
simultaneous event, e.g., victims of a disaster. A term to be deprecated, because it
can be confused with case fatality rate.

feasibility study Preliminary study to determine practicability of a proposed health 
program or procedure, or of a larger study, and to appraise the factors that may 
influence its practicability. See also pilot study. 

fecundity The ability to produce live offspring. Fecundity is difficult to measure since
it refers to the theoretical ability of a woman to conceive and carry a fetus to term. 
If a woman produces a live birth, it is known that she and her consort were fecund 
during some time in the past. 

fertility The actual production of live offspring. Stillbirths, fetal deaths, and abor¬ 
tions are not included in the measurement of fertility in a population. See also 
gravidity; parity. 

fertility rate See general fertility rate. 

fertility ratio A measure of the fertility of the population that restricts the denom¬ 
inator to the female population of appropriate age for childbearing. The fertility 
ratio is defined as 

$$\text{Fertility ratio} = \frac{\text{Number of girls under 15 years of age}}{\text{Number of women in 15-49 age group}} \times 1000$$

(Not to be confused with general fertility rate.) 

fetal death (Syn: stillbirth) Death prior to the complete expulsion or extraction from
its mother of a product of conception, irrespective of the duration of pregnancy.
The death is indicated by the fact that after such separation the fetus does not
breathe or show any other evidence of life, such as beating of the heart, pulsation
of the umbilical cord, or definite movement of voluntary muscles. Defined variously
as death after the 20th or 28th week of gestation (the definition of the length of
gestation varies between different jurisdictions, making this event difficult to
compare internationally). See also live birth.

fetal death certificate (Syn: certificate of stillbirth) A vital record registering a fetal
death or stillbirth. Some health jurisdictions require the use of a fetal death certif¬ 
icate for all products of conception, whereas others require its use only in cases in 
which gestation has reached a particular duration, usually the 20th or the 28th 
week. 

fetal death rate (Syn: stillbirth rate) The number of fetal deaths in a year expressed 
as a proportion of the total number of births (live births plus fetal deaths) in the 
same year. 


$$\text{Fetal death rate} = \frac{\text{Number of fetal deaths in a year}}{\text{Number of fetal deaths plus live births in the same year}} \times 1000$$

Note that the denominator is larger than for the fetal death ratio and that the
fetal death rate is therefore lower than the fetal death ratio, which is used in some
jurisdictions. International comparisons of stillbirth or fetal death statistics will be
flawed if the distinction is not appreciated.

fetal death ratio A measure of fetal wastage, related to the number of live births.
Defined as

$$\text{Fetal death ratio} = \frac{\text{Number of fetal deaths in a year}}{\text{Number of live births in the same year}}$$

(Can be expressed per 1000.) 
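
The difference between the two measures lies only in the denominator, which a short sketch makes concrete. The function names and the `per` scaling parameter are assumptions made for this illustration.

```python
def fetal_death_rate(fetal_deaths, live_births, per=1000):
    """Fetal deaths per `per` total births (fetal deaths plus live births)."""
    return per * fetal_deaths / (fetal_deaths + live_births)


def fetal_death_ratio(fetal_deaths, live_births, per=1000):
    """Fetal deaths per `per` live births."""
    return per * fetal_deaths / live_births
```

For any year with at least one fetal death, the rate is lower than the ratio because its denominator also counts the fetal deaths, so figures from jurisdictions using different measures are not directly comparable.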

FIELD SURVEY The planned collection of data in "the field," i.e., usually among
noninstitutionalized persons in the general population. A method of establishing a
relationship between two or more variables in a population in numerical terms by
eliciting and collating information from existing sources (not only records but people
who can say how they feel or what happened). See also cross-sectional study.

Finlay, Carlos Albert (1833-1915) Cuban physician, initial investigator (1888-1891)
of the role of Aedes aegypti (then known as Culex fasciatus) in the transmission of
yellow fever. His experiments were unsatisfactory, but his theory was fully
confirmed by the experiments of the team led by Reed in which he took an active part.

Fisher’s exact test The test for association in a two-by-two table that is based upon
the exact hypergeometric distribution of the frequencies within the table.

FISHING EXPEDITION Exploratory study to find clues and leads for further study.
Although the term is sometimes used pejoratively, "fishing expeditions" may be done
for worthwhile causes, e.g., to seek clues to the cause of a major life-threatening
outbreak. A recent example was the initial investigation of Legionnaires’ disease.

fitness This word has specific meanings in several fields related to epidemiology.

1. In population genetics, a measure of the relative survival and reproductive
success of a given individual or phenotype, or population subgroup.

2. In health promotion, health risk appraisal, physical fitness is a set of attributes
that people have or achieve, that relate to their ability to perform physical
activity. Intellectual and emotional fitness can also be described and to some
extent measured.

fixed cohort A cohort in which membership is fixed by being present at some
defining event ("zero time"); an example is the cohort comprising survivors of the atomic
bomb exploded at Hiroshima. See also closed cohort.

follow-up Observation over a period of time of an individual, group, or initially
defined population whose appropriate characteristics have been assessed in order to
observe changes in health status or health-related variables. See also cohort.

FOLLOW-UP STUDY

1. A study in which individuals or populations, selected on the basis of whether
they have been exposed to risk, received a specified preventive or therapeutic
procedure, or possess a certain characteristic, are followed to assess the
outcome of exposure, the procedure, or effect of the characteristic, e.g.,
occurrence of disease.

2. Synonym for cohort study.

fomites (singular, fomes) Articles that convey infection to others because they have
been contaminated by pathogenic organisms. Examples include handkerchief,
drinking glass, door handle, clothing, and toys.

force of morbidity (Syn: hazard rate, instantaneous incidence density, instantaneous
incidence rate, person-time incidence rate) Theoretical measure of the number of
new cases that occur per unit of population-time, e.g., person-years at risk. This is
a measure of the occurrence of disease at a point in time, t, defined mathematically
as the limit, as Δt approaches zero, of

$$\frac{\text{Probability that a person well at time } t \text{ will develop the disease in the interval } t \text{ to } t + \Delta t}{\Delta t}$$

The average value of this quantity over the interval t to (t + Δt) can be estimated as

$$\frac{\text{Incident cases observed from } t \text{ to } (t + \Delta t)}{\text{Number of person-time units of experience observed from } t \text{ to } (t + \Delta t)}$$
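
The estimate of the average value over an interval can be sketched in a few lines. The input layout here is an assumption for illustration: each person contributes a pair giving the time spent at risk during the interval and whether that person became a case.

```python
def person_time_incidence_rate(follow_up):
    """Incident cases divided by total person-time at risk.

    follow_up: iterable of (time_at_risk, became_case) pairs, one per person,
    with time_at_risk in any consistent unit (e.g., person-years).
    """
    records = list(follow_up)
    cases = sum(1 for _, became_case in records if became_case)
    person_time = sum(time_at_risk for time_at_risk, _ in records)
    return cases / person_time
```

For example, two cases arising among three people followed for 2, 5, and 3 years give 2 cases per 10 person-years, i.e., 0.2 per person-year.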

force of mortality (Syn: actuarial death rate) The hazard rate of the occurrence of
death at a point in time t, i.e., the limit, as Δt approaches zero, of the probability
that an individual alive at time t will die by time t + Δt, divided by Δt. Distinct from
cumulative death rate.

forecasting A method of estimating what may happen in the future that relies on
extrapolation of existing trends (demographic, epidemiologic, etc.). It may be less
useful than scenario building, which has greater flexibility. For example,
extrapolation of mortality trends for coronary heart disease in the early 1960s in the
United States suggested that the mortality rates would continue to rise, perhaps
indefinitely, whereas in fact the rates began to fall soon after that time.

fortuitous relationship A relationship that occurs by chance and needs no further 
explanation. 

forward survival estimate A procedure for estimating the age distribution at some 
later date by projecting forward an observed age distribution. The procedure uses 
survival ratios, often obtained from model life tables. 
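
A minimal sketch of one projection step follows. It assumes equal-width age groups, a projection interval equal to the group width, no migration, and no modeling of births (the youngest group is left empty); the oldest group is assumed to die out. The function and parameter names are hypothetical.

```python
def project_forward(population, survival_ratios):
    """Project an age distribution forward by one age-group width.

    population[i]      : observed count in age group i
    survival_ratios[i] : assumed proportion of group i surviving into group i + 1
                         (one fewer entry than there are age groups)
    """
    if len(survival_ratios) != len(population) - 1:
        raise ValueError("need one survival ratio per transition between groups")
    projected = [0.0] * len(population)  # youngest group: births not modeled
    for i, ratio in enumerate(survival_ratios):
        projected[i + 1] = population[i] * ratio
    return projected
```

Repeating the step projects further into the future; in practice the survival ratios would come from a life table appropriate to the population.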


FOURFOLD TABLE See CONTINGENCY TABLE. 

Fracastorius, Girolamo (1484-1553) Physician, poet, natural scientist, and a man of
legends, said to have required surgery at birth to open fused lips and to have
survived a lightning bolt that killed his mother while he was in her arms as an infant.
He gave the word "syphilis" to the world in his mock-heroic poem, Syphilis sive
morbus Gallicus (1530), which explicitly described the transmission of disease by acts
of venery. In De contagione (1546), he described transmission of infection by direct
contact, by fomites, and "at a distance," by which he meant droplets.

FRAMINGHAM STUDY Probably the best known cohort study of heart disease. Since 1949,
samples of residents of Framingham, Massachusetts, have been subjects of
investigations of risk factors in relation to the occurrence of heart disease and later other
outcomes.

Frank, Johann Peter (1745-1821) Author of System einer vollständigen medicinischen
Polizey, which established hygiene as a systematic science and contained many
suggestions based on epidemiologic observations. In modern terminology, Frank was
"Director-general of public health" to the Hapsburg empire in eighteenth century
Vienna. His System contained many sensible rules for individual good health and
detailed specifications for public health practice.

frequency See occurrence. 

FREQUENCY DISTRIBUTION See DISTRIBUTION. 

FREQUENCY MATCHING See MATCHING. 

frequency polygon A graphic illustration of a distribution, made by joining a set of
points, for each of which the abscissa is the midpoint of the class and the ordinate,
or height, is the frequency.



[Figure: Frequency polygon; horizontal axis: serum cholesterol value (mg/100 ml). From Rimm et al., 1980.]

function A quality, trait, or fact that is so related to another as to be dependent upon
and to vary with this other.

Galton, Francis (1822-1911) A founder of the modern science of human biology and
the inventor of several statistical methods. Perhaps he is best known as the author
of Hereditary Genius (1869), an analysis of physical and intellectual characteristics of
successive generations of several hundred prominent families. Observing that
offspring of parents of unusual talent, height, etc., tended toward average, he
formulated the "Law of filial regression" (the origin of the term "regression"). His
statistical approaches were refined and extended by his pupil, Karl Pearson, the
founder of modern biometry.

Gaussian distribution See normal distribution. 

game theory A branch of mathematical logic concerned with the range of possible 
reactions to a particular strategy; each reaction can be assigned a probability and 
each reaction can lead to further action by the "adversary" in the game. Used mainly
in systems analysis and such applications as war-gaming, game theory has occasional
applications in disease surveillance and control. It is also one of the underlying 
theories used in clinical decision analysis. 

gene A sequence of DNA that codes for a particular protein product or that regulates 
other genes. Genes are the biological basis of heredity and occupy precisely defined 
locations on chromosomes. 

gene pool The total of all genes possessed by reproductive members of a population.

general fertility rate A more refined measure of fertility than the crude birth rate.
The denominator is restricted to the number of women of childbearing age (i.e.,
15-44 or 15-49). Defined as

$$\text{General fertility rate} = \frac{\text{Number of live births in an area during a year}}{\text{Midyear female population age 15-44 in same area in same year}}$$


The upper age limit for this rate is 44 years in most jurisdictions.

generation effect (Syn: cohort effect) Variation in health status that arises from the
different causal factors to which each birth cohort (see cohort) in the population
is exposed as the environment and society change. Each consecutive birth cohort is
exposed to a unique environment that coincides with its life span.

generation time The interval between receipt of infection by and maximal infectivity
of the host. This applies to both clinical cases and inapparent infections.

With person-to-person transmission of infection, the interval between cases is
determined by the generation time. See also incubation period.

genetic drift Random variation in gene frequency from generation to generation;
most often observed in small populations. The process of evolution through
random statistical fluctuation of genetic composition of populations.

genetic epidemiology The science that deals with the etiology, distribution and
control of disease in groups of relatives, and with inherited causes of disease in
populations.¹

¹Morton NE: Outline of genetic epidemiology. New York: Karger, 1982.

genetic linkage Particular genes occupy specific sites in chromosomes, one member
of each pair of chromosomes of course coming from each parent. When two genes
are fairly close to each other in the same chromosome pair, they tend to be
inherited together. Such genes are said to be linked, and the phenomenon is called
genetic linkage.

genetic penetrance The extent to which a genetically determined condition is
expressed in an individual. This determines the frequency with which genetic effect
is shown in a population.

genetics The branch of biology dealing with heredity and variation of individual 
members of a species. Its branches include population genetics, which overlaps ep¬ 
idemiology; therefore we include pertinent genetic terms in this dictionary. 

genome The array of genes carried by an individual. 

geographic pathology (Syn: medical geography) The comparative study of
countries, or of regions within them, with regard to variations in morbidity/mortality.
The (implied) aim of such study is usually to demonstrate that the variations are
caused by or related to differences in the geographic environment.

GEOMETRIC MEAN See MEAN, GEOMETRIC. 

gestational age Strictly speaking, the gestational age of a fetus is the elapsed time
since conception. However, as the moment when conception occurred is rarely known
precisely, the duration of gestation is measured from the first day of the last normal
menstrual period. Gestational age is expressed in completed days or completed weeks
(e.g., events occurring 280-286 days after the onset of the last normal menstrual
period are considered to have occurred at 40 weeks of gestation).

Measurements of fetal growth, as they represent continuous variables, are
expressed in relation to a specific week of gestational age (e.g., the mean birth weight
for 40 weeks is that obtained at 280-286 days of gestation on a
weight-for-gestational-age curve). Some specified variations of gestational age are: Preterm: Less
than 37 completed weeks (less than 259 days). Term: From 37 to less than 42
completed weeks (259-293 days). Postterm: Forty-two completed weeks or more (294
days or more).
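
The day and week thresholds above translate directly into code. The sketch below uses hypothetical function names and takes the number of completed days since the first day of the last normal menstrual period.

```python
def completed_weeks(days):
    """Completed weeks of gestation for a given number of completed days."""
    return days // 7


def classify_gestation(days):
    """Classify gestational age using the day thresholds given in the entry."""
    if days < 259:       # less than 37 completed weeks
        return "preterm"
    if days <= 293:      # 37 to less than 42 completed weeks
        return "term"
    return "postterm"    # 42 completed weeks or more
```

Integer division reproduces the "completed weeks" convention: any event from day 280 through day 286 falls at 40 completed weeks.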

“gold standard” A jargon term, used to describe a method, procedure, or measure¬ 
ment that is widely accepted as being the best available. Often used to compare with 
new methods. 

Goldberger, Joseph (1874-1927) A U.S. Public Health Service physician. Responsible
for a brilliant series of investigations of pellagra. After logical deductions led him 
to reject the prevailing view that pellagra had an infectious origin, he conducted 
studies in several rural communities and in institutions, leading conclusively to the 
demonstration that pellagra was a dietary deficiency disease. 

Gompertz's law The proportionate relationship of mortality to age. Mortality is high
during the first year of life (infancy), drops to its lowest level in childhood, and
gradually climbs during the third and fourth decade. After age 35 or 40, the
increase in mortality with age tends to be logarithmic for the remainder of the life
span, i.e., the relative increase in mortality in each successive age class (of equal
size) is about constant. This law was first enunciated by the demographer Benjamin
Gompertz, on the basis of survival curves in English villages in the 1840s.

gonadotrophic cycle One complete round of ovarian development in the mosquito
(or other insect vector) from the time when the blood meal is taken to the time
when the fully developed eggs are laid.

goodness of fit Degree of agreement between an empirically observed distribution
and a mathematical or theoretical distribution.

goodness-of-fit test A statistical test of the hypothesis that data have been randomly
sampled or generated from a population that follows a particular theoretical
distribution or model. The most common such tests are chi-square tests.

gradient of infection The variety of host responses to infection ranging from
inapparent infection to fatal illness.

graph Visual display of the relationship between variables; the values of one
variable are plotted along the horizontal or x axis, and those of a second variable along the
vertical or y axis. Three-dimensional graphs of relationships between three variables
can be represented and comprehended visually in two dimensions. The relationship 
between x and y may be linear, exponential, logarithmic, etc. See also axis, abscissa, 
ordinate. "Graph" is also a descriptive term for histograms, bar charts, etc. 


[Figure: Graph showing abscissa, ordinate, and locus of a point, P, in relation to the x and y axes.]


Graunt, John (1620—1674) By profession a haberdasher, he was a member of the 
small community of scholars and natural scientists in London who were Fellows of 
the Royal Society in its early years and who made important contributions to the 
natural sciences. Graunt studied the bills of mortality and used them to conduct 
the first analytic studies of vital statistics, identifying differences in mortality rates 
between the sexes, between city and country folk, and recording all in Natural and
Political Observations Mentioned in a Following Index and Made upon the Bills of Mortality
(London, 1662).

gravidity The number of pregnancies (completed or incomplete) experienced by a
woman.

Greenwood, Major (1888-1949) Medical epidemiologist, trained in statistics by Karl
Pearson; Greenwood was the first professor of epidemiology at the London School
of Hygiene and Tropical Medicine. He inspired a whole generation of British
epidemiologists, introducing to the subject a level of mathematical reasoning and
statistical rigor it had not previously known. Author of many papers and several
monographs, best known of which is Epidemics and Crowd Diseases (London, 1933).

gross reproduction rate The average number of female children a woman would
have if she survived to the end of her childbearing years and if, throughout that
period, she were subject to a given set of age-specific fertility rates and a given sex
ratio at birth. This rate provides a measure of replacement fertility in the absence
of mortality. See also net reproduction rate.

growth rate of population A measure of population growth (in the absence of
migration) comprising addition of newborns to the population and subtraction of deaths.
The result, known as natural rate of increase, is calculated as

$$\frac{\text{Live births during the year} - \text{deaths during the year}}{\text{Midyear population}}$$

Alternatively, it is the difference between crude birth rate and crude death rate.
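
The equivalence of the two calculations can be checked with a small sketch. The function names and the `per` scaling parameter (rates per 1000 population) are assumptions made for this illustration.

```python
def crude_rate(events, midyear_population, per=1000):
    """Events during the year per `per` midyear population."""
    return per * events / midyear_population


def natural_rate_of_increase(births, deaths, midyear_population, per=1000):
    """(Live births - deaths) per `per` midyear population; migration ignored."""
    return per * (births - deaths) / midyear_population
```

With 30 live births and 10 deaths in a midyear population of 1000, the natural rate of increase is 20 per 1000, the same as a crude birth rate of 30 per 1000 minus a crude death rate of 10 per 1000.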

Hackett spleen classification A numerical means of recording the size of an
enlarged spleen, especially in malaria. This is a 6-point scale of 0 (no enlargement) to
5 (enlarged to umbilicus or larger). See Terminology of Malaria and of Malaria
Eradication. Geneva: WHO, 1963, pp. 40-41.

HALO EFFECT 

1. The effect (usually beneficial) that the manner, attention, and caring of a 
provider have on a patient during a medical encounter regardless of what 
medical procedures or services the encounter involves. See also placebo, pla¬ 
cebo effect. 

2. The influence upon an observation of the observer's perception of the
characteristics of the individual observed (other than the characteristic under study)
or the influence of the observer's recollection or knowledge of findings on a
previous occasion.

handicap Reduction in a person’s capacity to fulfill a social role as a consequence of an 
impairment, inadequate training for the role, or other circumstances. Applied to 
children, the term usually refers to the presence of an impairment or other circum¬ 
stance that is likely to interfere with normal growth and development or with the 
capacity to learn. See also international classification of impairments,
disabilities, and handicaps for the official WHO definition.

haphazard sample Selection of a group of persons for study without thought as to
whether they are representative of the population. The word "haphazard" here 
implies selection based on a mixture of criteria such as convenience, accessibility, 
turning up at the time an investigation or study is in progress, and belonging to 
some existing list or registry, etc. Because they have an unknown chance of being 
unrepresentative of the population, haphazard samples are unsatisfactory for
generalization.

Hardy-Weinberg law The principle that both gene and genotype frequencies will
remain in equilibrium in an infinitely large population in the absence of mutation,
migration, selection, and nonrandom mating. If p is the frequency of one allele and
q is the frequency of another and p + q = 1, then p² is the frequency of homozygotes
for the allele, q² is the frequency of homozygotes for the other allele, and 2pq is the
frequency of heterozygotes.
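
The genotype frequencies implied by the Hardy-Weinberg law can be sketched in a few lines of Python. This is only an illustration for the two-allele case described above; the function name hardy_weinberg and the example allele frequency 0.7 are assumptions made for the example, not part of the definition.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for two alleles with frequencies p and q = 1 - p.

    Returns (homozygotes for the first allele, heterozygotes,
    homozygotes for the second allele), i.e. (p^2, 2pq, q^2).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Illustrative allele frequency (assumed value, not real data).
homozygous_first, heterozygous, homozygous_second = hardy_weinberg(0.7)
print(homozygous_first, heterozygous, homozygous_second)
```

The three frequencies always sum to 1, because (p + q)² = p² + 2pq + q².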

HARMONIC MEAN See MEAN, HARMONIC. 

Hawthorne effect The effect (usually positive or beneficial) of being under study 
upon the persons being studied; their knowledge of the study often influences their 
behavior. The name derives from work studies by Whitehead, Dickson,
Roethlisberger, and others, in the Western Electric Plant, Hawthorne, Illinois, reported by
Elton Mayo in The Social Problems of an Industrial Civilization (London: Routledge,
1949).

HAZARD A factor or exposure that may adversely affect health.

hazard rate (Syn: force of morbidity, instantaneous incidence rate) A theoretical
measure of the risk of occurrence of an event, e.g., death, new disease, at a point
in time, t, defined mathematically as the limit, as Δt approaches zero, of the
probability that an individual well at time t will experience the event by t + Δt, divided by
Δt.
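
The limit in this definition can be illustrated numerically. The sketch below assumes, purely for illustration, an exponential survival model with a constant hazard of 0.2 per unit time; the names survival and approx_hazard are hypothetical. As the interval shrinks, the conditional probability of the event divided by the interval length approaches the assumed hazard.

```python
import math


def survival(t, rate=0.2):
    """Probability of still being well at time t under an assumed constant hazard."""
    return math.exp(-rate * t)


def approx_hazard(t, dt, surv=survival):
    """Probability that a person well at t has the event by t + dt, divided by dt."""
    s_t = surv(t)
    prob_event = (s_t - surv(t + dt)) / s_t  # conditional on being well at t
    return prob_event / dt


# Shrinking the interval moves the ratio toward the assumed hazard of 0.2.
for dt in (1.0, 0.1, 0.001):
    print(dt, approx_hazard(5.0, dt))
```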

health The World Health Organization (WHO) described health in the preamble to its 
constitution as "A state of complete physical, mental, and social well-being and not
merely the absence of disease or infirmity." The WHO description of health has
been criticized because of the difficulty of defining and measuring "complete" 
wellbeing. 

There are several other definitions, including the following: 

A state of dynamic balance in which an individual's or a group's capacity to cope
with all the circumstances of living is at an optimum level.

A state characterized by anatomical, physiological and psychological integrity; ability
to perform personally valued family, work and community roles; ability to deal with
physical, biological, psychological and social stress; a feeling of well-being; and
freedom from the risk of disease and untimely death.

Rene Dubos offered the following definition: "A modus vivendi enabling
imperfect men to achieve a rewarding and not too painful existence while they cope with
an imperfect world." 

The word "health" is derived from the Old English Hal, meaning hale, whole, 
sound in wind and limb. 

health behavior The combination of knowledge, practices, and attitudes that
together contribute to motivate the actions we take regarding health. Health behavior
may promote and preserve good health, or if the behavior is harmful, e.g., tobacco
smoking, may be a determinant of disease. This combination of knowledge,
practices, and attitudes has been described and discussed by several writers, notably
Becker.1 See also illness behavior.

1 Becker MH (ed): The Health Belief Model and Personal Health Behavior. Thorofare NJ: Slack, 1974.

health care Those services provided to individuals or communities by agents of the 
health services or professions, for the purpose of promoting, maintaining,
monitoring, or restoring health. Health care is broader than, and not limited to, medical
care, which implies therapeutic action by or under the supervision of a physician. 
The term is sometimes extended to include self-care. 

health education The process by which individuals and groups of people learn to
behave in a manner conducive to the promotion, maintenance, or restoration of
health. 

health index A numerical indication of the health of a given population derived from 
a specified composite formula. The components of the formula may be infant 
mortality rates, incidence rates for particular diseases, or other health
indicators.

health indicator A variable, susceptible to direct measurement, that reflects the state 
of health of persons in a community. Examples include infant mortality rates,
incidence rates based on notified cases of disease, disability days, etc. These measures
may be used as components in the calculation of a health index. 

health promotion The process of enabling people to increase control over and
improve their health. It involves the population as a whole in the context of their
everyday lives, rather than focusing on people at risk for specific diseases, and is 
directed toward action on the determinants or causes of health. 

health risk appraisal (HRA) (Syn: health hazard appraisal [HHA]) A generic term
applied to methods for describing an individual's chances of becoming ill or dying
from selected causes. The many versions now available share several common
features: Starting from the average risk of death for the individual's age and sex, a
consideration of various lifestyle and physical factors indicates whether the
individual is at greater or less than average risk of death from the commonest causes of
death for his age and sex. All methods also indicate what reduction in risk could
be achieved by altering any of the causal factors (such as cigarette smoking) that
the individual could modify.

The premise underlying such methods is that information on the extent to which
an individual's characteristics, habits, and health practices are influencing his future
risk of dying will assist health care workers in counseling their patients.

health services Services that are performed by health care professionals, or by others 
under their direction, for the purpose of promoting, maintaining, or restoring health. 
In addition to personal health care, health services include measures for health 
protection and health education. 

health services research The integration of epidemiologic, sociological, economic, 
and other analytic sciences in the study of health services. Health services research 
is usually concerned with relationships between need, demand, supply, use, and 
outcome of health services. The aim of health services research is evaluation;
several components of evaluative health services research are distinguished, viz.:

Evaluation of structure, concerned with resources, facilities, and manpower. 

Evaluation of process, concerned with matters such as where, by whom, and how 
health care is provided. 

Evaluation of output, concerned with the amount and nature of health services
provided. 

Evaluation of outcome, concerned with the results, i.e., whether persons using health 
services experience measurable benefits such as improved survival or reduced 
disability. 

health statistics Aggregated data describing and enumerating attributes, events,
behaviors, services, resources, outcomes, or costs related to health, disease, and health
services. The data may be derived from survey instruments, medical records, and
administrative documents. Vital statistics are a subset of health statistics.

health status index A set of measurements designed to detect short-term
fluctuations in the health of members of a population; these measurements generally
include physical function, emotional well-being, activities of daily living, feelings, etc.
Most indexes require the use of carefully composed questions designed with
reference to matters of fact rather than shades of opinion. The results are usually
expressed by a numerical score that gives a profile of the well-being of the individual.

health survey A survey designed to provide information on the health status of a
population. It may be descriptive, exploratory, or explanatory. See also morbidity
survey.

healthy worker effect A phenomenon observed initially in studies of occupational
diseases: Workers usually exhibit lower overall death rates than the general
population, due to the fact that the severely ill and disabled are ordinarily excluded from
employment. Death rates in the general population may be inappropriate for
comparison if this effect is not taken into account.

hebdomadal mortality rate The mortality rate in the first week of life; the
denominator is the number of live births in a year.

Henle-Koch postulates See Koch's postulates.

herd immunity The immunity of a group or community. The resistance of a group to
invasion and spread of an infectious agent, based on the resistance to infection of
a high proportion of individual members of the group. The resistance is a product
of the number susceptible and the probability that those who are susceptible will
come into contact with an infected person. In the herd immunity equation,
"probability of contact" is the intervening factor that reduces susceptibility to infection
among group members to less than that anticipated from their susceptibility as
unrelated individuals.

heteroscedasticity Nonconstancy of the variance of a measure over the levels of the
factors under study. 

HIBERNATION See VECTOR-BORNE INFECTION. 

Hippocrates of Cos (c 460-370 BC) Greek physician, "Father of Medicine,"
responsible for careful clinical observation of many important and common diseases:
tetanus, mumps, puerperal septicemia, etc. His writings contain important
epidemiologic observations, as in the books Airs, Waters, Places, and Epidemics. His
Aphorisms also demonstrate considerable empirical epidemiologic knowledge.

histogram A graphic representation of the frequency distribution of a variable.
Rectangles are drawn in such a way that their bases lie on a linear scale representing
different intervals, and their heights are proportional to the frequencies of the
values within each of the intervals. See also bar diagram.

Histogram. From National Center for Health Statistics, 1978.
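
The frequencies behind a histogram can be tabulated as in the following sketch. The ages and interval boundaries are made-up values chosen only for illustration, and histogram_counts is a hypothetical helper. With equal-width intervals, as here, the rectangle heights are simply the counts.

```python
def histogram_counts(values, edges):
    """Count how many values fall in each interval [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Made-up ages and 10-year intervals, for illustration only.
ages = [3, 7, 12, 15, 18, 21, 24, 24, 29, 33, 38, 41]
edges = [0, 10, 20, 30, 40, 50]
counts = histogram_counts(ages, edges)

# Crude text rendering: one '#' per observation in each interval.
for low, high, n in zip(edges, edges[1:], counts):
    print(f"{low:>2}-{high:<2} | " + "#" * n)
```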

historical cohort study (Syn: historical prospective study, nonconcurrent
prospective study, prospective study in retrospect) A cohort study conducted by
reconstructing data about persons at a time or times in the past. This method uses
existing records about the health or other relevant aspects of a population as it was at
some time in the past and determines the current (or subsequent) status of
members of this population with respect to the condition of interest. Different levels of
past exposure to risk factor(s) of interest must be identifiable for subsets of the
population. See also cohort study.

historical control Control subject(s) for whom data were collected at a time
preceding that at which the data are gathered on the group being studied. Because of
differences in exposures etc., use of historical controls can lead to bias in analysis.

Hogben number A unique personal identifying number constructed by using a
sequence of digits for birthdate, sex, birthplace, and other identifiers. Suggested by
the English mathematician Lancelot Hogben. Used in primary care epidemiology
in some countries and usable in record linkage. See also identification number;
soundex code.

Holmes, Oliver Wendell (1809-1894) Physician, poet, philosopher, autocrat ("of the
Breakfast Table"), and crusader against puerperal fever. He argued that this was
conveyed to patients by the contaminated hands and clothes of attending physicians
and recommended washing the hands and changing clothes as a way to prevent it.
Unlike semmelweis, he succeeded in convincing the medical profession. His correct
belief was recorded in a paper, "The Contagiousness of Puerperal Fever."1

1 N Engl Q J Med Surg 1:503-530, 1842-43.

holoendemic disease A disease for which a high prevalent level of infection begins
early in life and affects most of the child population, leading to a state of
equilibrium such that the adult population shows evidence of the disease much less
commonly than do the children. Malaria in many communities is a holoendemic disease.

holomiantic infection See common source epidemic.

homoscedasticity Constancy of the variance of a measure over the levels of the
factors under study.

HOSPITAL-ACQUIRED INFECTION See NOSOCOMIAL INFECTION.

hospital discharge abstract system Abstraction of minimum data set from hospital 
charts for the purpose of producing summary statistics about hospitalized patients. 
Examples include the Hospital Inpatient Enquiry (HIPE) and Professional Activity 
Study (PAS). The statistical tabulations commonly include length of stay by final 
diagnosis, surgical operations, specified hospital service (i.e., medical, surgical, 
gynecological, etc.) and also give outcomes such as "death" and "discharged alive
from hospital." This system cannot generally be used for epidemiologic purposes
as it is not possible to infer representativeness or to generalize: this is because the
data usually lack a defined denominator and the same person may be counted more
than once in the event of two or more hospital separations in the period of study.

hospital inpatient enquiry (HIPE) Statistical tables of a 10% sample of hospital
patients in England and Wales, showing class of hospital, diagnosis, length of stay,
outcomes, etc.

hospital separation A term used in commentaries on hospital statistics to describe
the departure of a patient from hospital without distinguishing whether the patient
departed alive or dead (the distinction is unimportant so far as the statistics of
hospital activity such as bed occupancy are concerned).

HOST 

1. A person or other living animal, including birds and arthropods, that affords 
subsistence or lodgment to an infectious agent under natural conditions. Some 
protozoa and helminths pass successive stages in alternate hosts of different 
species. Hosts in which the parasite attains maturity or passes its sexual stage 
are primary or definitive hosts; those in which the parasite is in a larval or 
asexual state are secondary or intermediate hosts. A transport host is a carrier 
in which the organism remains alive but does not undergo development.1

2. In an epidemiologic context, the host may be the population or group;
biological, social, and behavioral characteristics of this group that are relevant to
health are called "host factors."

1 Benenson, op. cit.

household One or more persons who occupy a dwelling, i.e., a place that provides 
shelter, cooking, washing, and sleeping facilities; may or may not be a family. The
term is also used to describe the dwelling unit in which the persons live.

household sample survey A survey of persons in a sample of households. This, in
many variations, is a favored method of gathering data for health-related and for
many other purposes. The households may be sampled in any of several ways, e.g.,
by cluster, use of random numbers in relation to numbered dwelling units. The
survey may be conducted by interview, telephone survey, or self-completed
responses to preset questions. The method is used in developing nations as well as
in the industrial world.

human blood index Proportion of insect vectors found to contain human blood. 

HUMAN ECOLOGY See ECOLOGY. 

human immunodeficiency virus (HIV) The pathogenic organism responsible for the
acquired immunodeficiency syndrome (AIDS); formerly or also known as the
lymphadenopathy virus (LAV), the name given by the original French discoverers
Montagnier et al.1 in 1983, or the human T-cell lymphotropic virus, type III
(HTLV-III), the name given by Gallo et al.2 to the virus they reported in 1984.

1 Barre-Sinoussi F, Chermann JC, Rey F, et al.: Isolation of a T-lymphotropic retrovirus from a
patient at risk for acquired immune deficiency syndrome (AIDS). Science 220:868-871, 1983.
2 Gallo RC, Salahuddin SZ, Popovic M, et al.: Frequent detection and isolation of cytopathic
retroviruses (HTLV-III) from patients with AIDS and at risk for AIDS. Science 224:500-503, 1984.

hyperendemic disease A disease that is constantly present at a high incidence and/or
prevalence rate and affects all age groups equally.

hypergeometric distribution The exact probability distribution of the frequencies in
a two-by-two contingency table, conditional on the marginal frequencies being fixed
at their observed levels.
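
As a hedged illustration, the probability of one particular two-by-two table, given its row and column totals, can be computed from binomial coefficients. The cell labels a, b, c, d and the function name table_probability are assumptions made for this sketch, and the example counts are invented.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]],
    conditional on its row totals (a + b, c + d) and column totals (a + c, b + d)."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Invented table: rows are exposed/unexposed, columns are cases/noncases.
print(table_probability(3, 1, 1, 3))

# Probabilities of all tables sharing the same margins (all totals equal to 4) sum to 1.
total = sum(table_probability(a, 4 - a, 4 - a, a) for a in range(5))
print(total)
```

Summing such probabilities over the tables at least as extreme as the one observed is the basis of exact tests for two-by-two tables.
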
hypothesis 

1. A supposition, arrived at from observation or reflection, that leads to refutable 
predictions. 

2. Any conjecture cast in a form that will allow it to be tested and refuted. 

See also null hypothesis. 

iatrogenic disease Illness resulting from a physician's professional activity, or from
the professional activity of other health professionals.

icd See international classification of disease. 

iceberg phenomenon That portion of disease that remains unrecorded or undetected
despite physicians' diagnostic endeavors and community disease surveillance
procedures is referred to as the "submerged portion of the iceberg." Detected or
diagnosed disease is the "tip of the iceberg." The submerged portion comprises
disease not medically attended, medically attended but not accurately diagnosed, and
diagnosed but not reported.1

1 Last JM: The Iceberg. Lancet 2:28-31, 1963.

ichppc See international classification of health problems in primary CARE. 

identification number, identifying number Unique number given to every
individual at birth or at some other milestone. Sweden has a system based on a sequence
of digits for birthdate, sex, birthplace, and additional digits for each individual.
Other systems, e.g., National Insurance number in the United Kingdom, Social
Security number in the United States, and Social Insurance number in Canada, are
sometimes used but are neither universal nor unique, being sometimes applied to
whole families or at least to more than one individual. See also hogben number;
soundex code.

idiosyncrasy Webster's Dictionary defines this as a distinctive characteristic or
peculiarity of an individual. In pharmacoepidemiology, it means an abnormal reaction,
sometimes genetically determined, following the administration of a medication.

ILLNESS See DISEASE. 

illness behavior Conduct of persons in response to abnormal body signals. Such
behavior influences the manner in which a person monitors his body, defines and
interprets his symptoms, takes remedial actions, and uses the health care system.
See also health behavior. 

immunity, acquired Resistance acquired by a host as a result of previous exposure to 
a natural pathogen or foreign substance for the host, e.g., immunity to measles
resulting from a prior infection with measles virus. 

immunity, active Resistance developed in response to stimulus by an antigen
(infecting agent or vaccine) and usually characterized by the presence of antibody
produced by the host.

immunity, natural Species-determined inherent resistance to a disease agent, e.g.,
resistance of man to virus of canine distemper.

immunity, passive Immunity conferred by an antibody produced in another host and 
acquired naturally by an infant from its mother or artificially by administration of 
an antibody-containing preparation (antiserum or immune globulin). 

immunity, specific A state of altered responsiveness to a specific substance acquired
through immunization or natural infection. For certain diseases (e.g., measles,
chickenpox) this protection generally lasts for the life of the individual.

immunization (Syn: vaccination) Protection of susceptible individuals from
communicable disease by administration of a living modified agent (as in yellow fever), a
suspension of killed organisms (as in whooping cough), or an inactivated toxin (as 
in tetanus). Temporary passive immunization can be produced by administration 
of antibody in the form of immune globulin in some conditions. 

impairment A physical or mental defect at the level of a body system or organ. See
also international classification of impairments, disabilities, and handicaps 
for the official WHO definition. 

inapparent infection (Syn: subclinical infection) The presence of infection in a host
without occurrence of recognizable clinical signs or symptoms. Of epidemiologic
significance because hosts so infected, though apparently well, may serve as silent
or inapparent disseminators of the infectious agent. See also disease, preclinical;
disease, subclinical; vector-borne infection. 

inception rate The rate at which new spells of illness occur in a population; a term 
applied principally to short-term spells of illness such as acute respiratory
infections, and preferred by some epidemiologists because an annual incidence rate for
such conditions may exceed the numbers in the population at risk. 

incidence (Syn: incident number) The number of instances of illness commencing, or
of persons falling ill, during a given period in a specified population.1 More
generally, the number of new events, e.g., new cases of a disease in a defined
population, within a specified period of time. The term incidence is sometimes used to
denote incidence rate.

1 Prevalence and Incidence. WHO Bull 35:783-784, 1966.

incidence density The person-time incidence rate: sometimes used to describe the 
hazard rate. See force of morbidity. 

incidence-density ratio (IDR) The ratio of two incidence densities. See also rate
ratio.

incidence rate The rate at which new events occur in a population. The numerator
is the number of new events that occur in a defined period; the denominator is the
population at risk of experiencing the event during this period, sometimes
expressed as person-time. The incidence rate most often used in public health
practice is calculated by the formula

(Number of new events in specified period / Number of persons exposed to risk during this period) x 10^n

In a dynamic population, the denominator is the average size of the population,
often the estimated population at the mid-period. If the period is a year, this is the
annual incidence rate. This rate is an estimate of the person-time incidence rate,
i.e., the rate per 10^n person-years. If the rate is low, as with many chronic diseases,
it is also a good estimate of the cumulative incidence rate. In follow-up studies with 
no censoring, the incidence rate is calculated by dividing the number of new cases 
in a specified period by the initial size of the cohort of persons being followed; this 
is equivalent to the cumulative incidence rate during the period. If the number of 
new cases during a specified period is divided by the sum of the person-time units 
at risk for all persons during the period, the result is the person-time incidence 
rate. 
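
The difference between the two denominators described above (persons at risk versus person-time at risk) can be sketched as follows. The cohort is invented for illustration, and the function names incidence_rate and person_time_rate are hypothetical.

```python
def incidence_rate(new_events, persons_at_risk, per=1000):
    """New events divided by the number of persons at risk, scaled by a multiplier."""
    return per * new_events / persons_at_risk


def person_time_rate(new_events, person_time_units, per=1000):
    """New events divided by the total person-time at risk, scaled by a multiplier."""
    return per * new_events / sum(person_time_units)


# Invented cohort: 5 people followed for up to 2 years with no censoring.
# Two people become cases (after 0.5 and 1.5 years) and stop contributing time at risk.
follow_up_years = [2.0, 2.0, 0.5, 2.0, 1.5]
cases = 2

cumulative = incidence_rate(cases, len(follow_up_years), per=100)  # per 100 persons
density = person_time_rate(cases, follow_up_years, per=100)        # per 100 person-years
print(cumulative, density)
```

In this invented cohort the first figure is 40 per 100 persons over the two-year period and the second is 25 per 100 person-years; the two measures diverge because the rate is high and follow-up lasts more than one time unit.
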

INCIDENCE STUDY See COHORT STUDY.

INCIDENT NUMBER See INCIDENCE.

INCUBATION PERIOD 

1. The time interval between invasion by an infectious agent and appearance of
the first sign or symptom of the disease in question. 

2. In a vector, the period between entry of the infectious agent into the vector
and the time at which the vector becomes infective; i.e., transmission of the
infectious agent from the vector to a fresh final host is possible (extrinsic
incubation period).

independence Two events are said to be independent if the occurrence of one is in no
way predictable from the occurrence of the other. Two variables are said to be
independent if the distribution of values of one is the same for all values of the
other. Independence is the antonym of association.
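
For two events A and B, a common formal statement of independence is P(A and B) = P(A) x P(B). The sketch below checks that identity exactly for a cross-classified population; the counts are invented and is_independent is a hypothetical helper. In sample data the equality rarely holds exactly, so in practice a statistical test of association would be used instead.

```python
from fractions import Fraction


def is_independent(n_both, n_a_only, n_b_only, n_neither):
    """Check whether P(A and B) equals P(A) * P(B) for a fully enumerated population."""
    n = n_both + n_a_only + n_b_only + n_neither
    p_a = Fraction(n_both + n_a_only, n)
    p_b = Fraction(n_both + n_b_only, n)
    p_ab = Fraction(n_both, n)
    return p_ab == p_a * p_b


# Invented counts: P(A) = 0.4, P(B) = 0.25, P(A and B) = 0.1, so the events are independent.
print(is_independent(10, 30, 15, 45))
# Same marginal probabilities but P(A and B) = 0.2: the events are associated.
print(is_independent(20, 20, 5, 55))
```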

INDEPENDENT VARIABLE 

1. The characteristic being observed or measured that is hypothesized to
influence an event or manifestation (the dependent variable) within the defined
area of relationships under study; that is, the independent variable is not
influenced by the event or manifestation but may cause it or contribute to its
variation. 

2. In statistics, an independent variable is one of (perhaps) several variables that 
appear as arguments in a regression equation. 

index In epidemiology and related sciences, this word usually means a rating scale, e.g., 
a set of numbers derived from a series of observations of specified variables.
Examples include the many varieties of health status index, scoring systems for
severity or stage of cancer, heart murmurs, mental retardation, etc.

index case The first case in a family or other defined group to come to the attention 
of the investigator. See also propositus. 

index group (Syn: index series)

1. In an experiment, the group receiving the experimental regimen. 

2. In a case control study, the cases. 

3. In a cohort study, the exposed group. 

indicator variable In statistics, a variable taking only one of two possible values, one
(usually 1) indicating the presence of a condition, and the other (usually zero)
indicating absence of the condition. Used mainly in regression analysis.
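
A minimal sketch of coding an indicator variable follows; the smoking categories are made-up example data and indicator is a hypothetical helper name.

```python
def indicator(values, condition):
    """Return 1 where the condition is present and 0 where it is absent."""
    return [1 if condition(v) else 0 for v in values]


# Made-up smoking histories for four subjects.
smoking_status = ["never", "current", "former", "current"]
is_current_smoker = indicator(smoking_status, lambda s: s == "current")
print(is_current_smoker)
```

The resulting column of zeros and ones can then enter a regression equation like any other numerical variable.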

indirect adjustment See standardization. 

individual variation Two types are distinguished: 

1. Intraindividual variation: The variation of biological variables within the same
individual, depending upon circumstances such as the phase of certain body
rhythms and the presence or absence of emotional stress. These variables do
not have a precise value, but rather a range. Examples include diurnal
variation in body temperature, fluctuation of blood pressure, blood sugar, etc.

2. Interindividual variation: As used by Darwin, the term means variation between
individuals. This is the preferred usage; the first usage is better described as
personal variation.

induction period The period required for a specific cause to produce disease. More 
precisely, the interval from the causal action of a factor to the initiation of the
disease. For example, a span of many years may pass between (presumably)
radiation-induced mutations and the appearance of leukemia; this span would be the
induction period for radiogenic leukemia. See also incubation period; latent period.

industrial hygiene The science and art devoted to recognition, evaluation, and
control of those environmental factors or stresses arising from or in the workplace,
which may cause sickness, impaired health, and well-being, or significant discomfort 
and inefficiency among workers or among persons in the community. Alternatively, 
the profession that anticipates and controls unhealthy conditions of work to prevent 
illness among employees. 

infant mortality rate (IMR) A measure of the yearly rate of deaths in children less
than one year old. The denominator is the number of live births in the same year.
Defined as

Infant mortality rate = (Number of deaths in a year of children less than 1 year of age / Number of live births in the same year) x 1000

This is often quoted as a useful indicator of the level of health in a community. 
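
A worked example of the infant mortality rate calculation, using invented counts (84 infant deaths and 12,000 live births in the same year); the function name infant_mortality_rate is hypothetical.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Deaths under 1 year of age per 1000 live births in the same year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000 * infant_deaths / live_births


# Invented counts for one calendar year.
print(infant_mortality_rate(84, 12000))
```

With these figures the rate is 7 per 1000 live births.
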

infectibility The host characteristic or state in which the host is capable of being
infected. See also infectiousness; infectivity.

infection (Syn: colonization) The entry and development or multiplication of an
infectious agent in the body of man or animals. Infection is not synonymous with
infectious disease; the result may be inapparent or manifest. The presence of living
infectious agents on exterior surfaces of the body is called "infestation" (e.g.,
pediculosis, scabies). The presence of living infectious agents upon articles of apparel
or soiled articles is not infection, but represents contamination of such articles.
See also inapparent infection; transmission of infection.

infection, gradient of The range of manifestations of illness in the host reflecting
the response to an infectious agent, which extends from death at one extreme to
inapparent infection at the other. The frequency of these manifestations varies with
the specific infectious disease. For example, human infection with the virus of
rabies is almost invariably fatal, whereas a high proportion of persons infected in
childhood with the virus of hepatitis A experience a subclinical or mild clinical
infection.

infection, latent period of The time between initiation of infection and first
shedding or excretion of the agent.

infection, subclinical See inapparent infection.

infectious disease See communicable disease.

infectiousness A characteristic of the disease that concerns the relative ease with which
it is transmitted to other hosts. A droplet spread disease, for instance, is more
infectious than one spread by direct contact. The characteristics of the portals of exit
and entry are thus also determinants of infectiousness, as are the agent
characteristics of ability to survive away from the host, and of infectivity.

INFECTIVITY 

1. The characteristic of the disease agent that embodies capability to enter,
survive, and multiply in the host. A measure of infectivity is the secondary attack
rate.

2. The proportion of exposures, in defined circumstances, that results in
infection.

INFERENCE The process of passing from observations and axioms to generalizations. In 
statistics, the development of generalization from sample data, usually with
calculated degrees of uncertainty.

infestation The development on (rather than in) the body of a pathogenic agent, e.g.,
body lice. Some authors use the term also to describe invasion of the gut by parasitic 

information system A combination of vital and health statistical data from multiple
sources, used to derive information about the health needs, health resources, costs,
use of health services, and outcomes of use by the population of a specified
jurisdiction. The term may also describe the automatic release from computers of stored
information in response to programmed stimuli. For example, parents can be
notified when their children are due to receive booster doses of an immunizing agent
against infectious disease.

informed consent Voluntary consent given by a subject or by a person responsible for
a subject (e.g., a parent) for participation in an investigation, immunization
program, treatment regimen, etc., after being informed of the purpose, methods,
procedures, benefits, and risks. Awareness of risk is necessary for any subject to make
an informed choice. The term also refers to consent for medical care.

INOCULATION See VACCINATION. 

INPUT

1. The sum total of resources and energies purposefully engaged in order to
intervene in the spontaneous operation of a system.

2. The basic resources required in terms of manpower, money, materials, and 
time. 

INSTANTANEOUS INCIDENCE RATE See FORCE OF MORBIDITY.

instrumental ERROR Error due to faults arising in any or in all aspects of a measuring 
instrument, i.e., calibration, accuracy, precision, etc. Also applied to error arising
from impure reagents, wrong dilutions, etc. 

INTERACTION 

1. The interdependent operation of two or more causes to produce or prevent 
an effect. Biological interaction means the interdependent operation of two or 
more causes to produce, prevent, or control disease. See also antagonism,
synergism.

2. Differences in the effects of one or more factors according to the level of the
remaining factor(s). See also effect modifier.

3. In statistics, the necessity for a product term in a linear model. 
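
The statistical sense can be illustrated with two binary factors. In the linear model y = b0 + b1*x1 + b2*x2 + b3*x1*x2, fitted exactly to the four cell means, the coefficient b3 of the product term equals the difference between the effect of x1 when x2 = 1 and its effect when x2 = 0. The sketch below assumes this simple two-factor setting; the cell means are invented and interaction_contrast is a hypothetical name.

```python
def interaction_contrast(y00, y10, y01, y11):
    """Coefficient of the product term for two binary factors.

    y_ij is the mean outcome with x1 = i and x2 = j. The result is
    (effect of x1 when x2 = 1) minus (effect of x1 when x2 = 0).
    """
    return (y11 - y01) - (y10 - y00)


# Invented cell means: purely additive effects, so no product term is needed.
print(interaction_contrast(1, 3, 2, 4))
# Invented cell means: the joint effect exceeds the sum of the separate effects.
print(interaction_contrast(1, 3, 2, 9))
```

A nonzero contrast on this additive scale indicates that a product term is needed; whether interaction appears can depend on the scale on which effects are measured.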

intermediate variable (Syn: contingent variable, intervening (causal) variable,
mediator variable) A variable that occurs in a causal pathway from an independent to
a dependent variable. It causes variation in the dependent variable, and itself is 
caused to vary by the independent variable. Such a variable is statistically associated 
with both the independent and dependent variables. 

INTERNAL VALIDITY See VALIDITY, STUDY. 

INTERNATIONAL CLASSIFICATION OF DISEASE (ICD) The classification of specific 
conditions and groups of conditions determined by an internationally representative group 
of experts who advise the World Health Organization, which publishes the complete 
list in a periodically revised book, the (Manual of the) International Statistical 
Classification of Diseases, Injuries and Causes of Death. Every disease entity is assigned 
a number. There are 17 major divisions (chapters) and a hierarchical arrangement 
of subdivisions (rubrics) within each. Some chapters are "etiologic," e.g., Infective 
and Parasitic Conditions; more relate to body systems, e.g., Circulatory System; and 
some to classes of condition, e.g., neoplasms, injury (violence). The heterogeneity 
of categories reflects prevailing uncertainties about causes of disease (and 
classification in relation to causes). The Ninth Revision of the Manual (ICD-9) was 
published by WHO in 1977, after ratification in 1976. 

INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS IN PRIMARY CARE (ICHPPC) A 
classification of diseases, conditions, and other reasons for attendance for primary 
care. May be used for labeling conditions in problem-oriented records as used by 
primary care health workers. This classification is an adaptation of the ICD but 
makes more allowance for the diagnostic uncertainty that prevails in primary care. 
This classification is now in its second revision (ICHPPC-2). See also 
PROBLEM-ORIENTED MEDICAL RECORD. 

INTERNATIONAL CLASSIFICATION OF IMPAIRMENTS, DISABILITIES, AND HANDICAPS 
(ICIDH) First published by WHO in 1980, this is an attempt to produce a systematic 
taxonomy of the consequences of injury and disease. 

An impairment is defined in ICIDH as any loss or abnormality of psychological, 
physiological, or anatomical structure or function. It is concerned with 
abnormalities of body structure and appearance and with organ or system function resulting 
from any cause; in principle, impairments represent disturbances at the organ level. 

A disability is defined in ICIDH as any restriction or lack (resulting from an 
impairment) of ability to perform an activity in a manner or within the range 
considered normal for a human being. The term disability reflects the consequences of 
impairment in terms of functional performance and activity by the individual; 
disabilities thus represent disturbances at the level of the person. 

A handicap is defined in ICIDH as a disadvantage for a given individual, resulting 
from an impairment or a disability, that limits or prevents the fulfillment of a role 
that is normal (depending on age, sex, and social and cultural practice) for that 
individual. The term handicap thus reflects interaction with and adaptation to the 
individual's surroundings. 

international comparison See geographic pathology. See also cross-cultural 
study. 

INTERNAL VALIDITY See VALIDITY, STUDY. 

interpolate, interpolation To predict the value of variates within the range of 
observations; the resulting prediction. 

interval incidence density See person-time incidence rate. 

interval scale See measurement scale. 

intervening cause See intermediate variable. 

INTERVENING VARIABLE 

1. Synonym for intermediate variable. 

2. A variable whose value is altered in order to block or alter the effect(s) of 
another factor. 

See also causality, factors in. 

intervention Study An epidemiologic investigation designed to test a hypothesized 
cause-effect relationship by modifying a supposed causal factor in a population. 

interview SCHEDULE The precisely designed set of questions used in an interview. See 
also survey instrument. 

involuntary smoking (Syn: passive smoking) The inhalation by nonsmokers of 
tobacco smoke left in the air by smokers; includes both smoke exhaled by smokers 
and smoke released directly from burning tobacco into ambient air; the latter is 
called sidestream smoke and contains higher proportions of toxic and other 
carcinogenic substances than exhaled smoke. The adjective "involuntary" is preferable to 
"passive" as the latter implies acquiescence; increasingly, nonsmokers are anything 
but acquiescent about this form of air pollution. "Passive" is, however, customary 
WHO usage. 

ISLAND population A group of individuals isolated from larger groups and possessing 
a relatively limited gene pool; alternatively, a group that is immunologically isolated 
and may therefore be unduly susceptible to infection with alien pathogens. 

isolate (noun) Term used in genetics to describe a subpopulation (generally small) in 
which matings take place exclusively with other members of the same 
subpopulation. 

ISOLATION 

1. In microbiology, the separation of an organism from others, usually by 
making serial cultures. 

2. Separation, for the period of communicability, of infected persons or animals 
from others in such places and under such conditions as to prevent or limit 
the direct or indirect transmission of the infectious agent from those infected 
to those who are susceptible or who may spread the agent to others. Control of 
Communicable Diseases in Man* lists seven categories of isolation as follows: 

a. Strict isolation: This category is designed to prevent transmission of highly 
contagious or virulent infections that may be spread by both air and 
contact. The specifications, in addition to those above, include a private room 
and the use of masks, gowns, and gloves for all persons entering the room. 
Special ventilation requirements with the room at negative pressure to 
surrounding areas are desirable. 

b. Contact isolation: For less highly transmissible or serious infections, for 
diseases or conditions that are spread primarily by close or direct contact. In 
addition to the basic requirements, a private room is indicated but patients 
infected with the same pathogen may share a room. Masks are indicated 
for those who come close to the patient, gowns are indicated if soiling is 
likely, and gloves are indicated for touching infectious material. 

c. Respiratory isolation: To prevent transmission of infectious diseases over short 
distances through the air, a private room is indicated but patients infected 
with the same organism may share a room. In addition to the basic 
requirements, masks are indicated for those who come in close contact with the 
patient; gowns and gloves are not indicated. 

d. Tuberculosis isolation (AFB isolation): For patients with pulmonary tubercu¬ 
losis who have a positive sputum smear or chest x-rays that strongly suggest 
active tuberculosis. Specifications include use of a private room with special 
ventilation and the door closed. In addition to the basic requirements, masks 
are used only if the patient is coughing and does not reliably and consis¬ 
tently cover the mouth. Gowns are used to prevent gross contamination of 
clothing. Gloves are not indicated. 

e. Enteric precautions: For infections transmitted by direct or indirect contact 
with feces. In addition to the basic requirements, specifications include use 
of a private room if patient hygiene is poor. Masks are not indicated; gowns 
should be used if soiling is likely and gloves are to be used for touching 
contaminated materials. 

f. Drainage/secretion precautions: To prevent infections transmitted by direct or 
indirect contact with purulent material or drainage from an infected body 
site. A private room and masking are not indicated; in addition to the basic 
requirements, gowns should be used if soiling is likely and gloves used for 
touching contaminated materials. 

g. Blood/body fluid precautions: To prevent infections that are transmitted by 
direct or indirect contact with infected blood or body fluids. In addition to 
the basic requirements, a private room is indicated if patient hygiene is 
poor; masks are not indicated; gowns should be used if soiling of clothing 
with blood or body fluids is likely. Gloves should be used for touching 
blood or body fluids. 

See also quarantine. 

* Benenson AS (Ed): Control of Communicable Diseases in Man, 14th ed. Washington DC: American 
Public Health Association, 1985. 

isometric chart A chart or graph that portrays three dimensions on a plane surface. 

jackknife A technique for estimating the variance and the bias of an estimator. If the 
sample size is n, the estimator is applied to each subsample of size n - 1, obtained 
by dropping a measurement from analysis. The sum of squared differences 
between each of the resulting estimates and their mean, multiplied by (n - 1)/n, is the 
jackknife estimate of variance; the difference between the mean and the original 
estimate, multiplied by (n - 1), is the jackknife estimate of bias. 
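
The jackknife procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `jackknife`, the sample values, and the choice of the arithmetic mean as the estimator are assumptions made for the example, not part of the definition.

```python
from statistics import mean

def jackknife(data, estimator):
    """Return (bias, variance) jackknife estimates for `estimator` on a list `data`."""
    n = len(data)
    original = estimator(data)
    # Apply the estimator to each subsample of size n - 1 (one measurement dropped).
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = mean(leave_one_out)
    # Sum of squared differences from the mean, multiplied by (n - 1)/n.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    # Difference between the mean and the original estimate, multiplied by (n - 1).
    bias = (n - 1) * (loo_mean - original)
    return bias, variance

sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # hypothetical measurements
bias, variance = jackknife(sample, mean)
```

For the sample mean the jackknife bias is zero and the jackknife variance equals the usual sample variance divided by n, which makes the mean a convenient check on the arithmetic.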

Jenner, Edward (1749-1823) An English physician and naturalist. On the basis of the 
observation that dairymaids who had had cowpox never got smallpox, he inoculated 
a boy age 10 with cowpox (vaccinia) in 1796. Over the succeeding two years he 
inoculated 22 more persons and then attempted to inoculate them with smallpox, 
always without inducing this infection. The results of his work were published in 
An Inquiry into the Causes and Effects of the Variolae Vaccinae (London, 1798). This 
successful method of immunizing persons and populations against smallpox led 
directly to the ultimate worldwide eradication of smallpox in 1977. 

KAP (knowledge, attitudes, practice) SURVEY A formal survey, using face-to-face 
interviews, in which women are asked standardized pretested questions dealing with 
their knowledge of, attitudes toward, and use of contraceptive methods. Detailed 
reproductive histories and attitudes toward desired family size are also elicited. 
Analysis of responses provides much useful information on family planning and 
gives estimates of possible future trends in population structure. The term has 
sometimes been used to describe other varieties of survey of knowledge, attitudes, 
and practice, e.g., health promotion in general or in particular, cigarette smoking. 

kappa A measure of the degree of nonrandom agreement between observers or 
measurements of the same categorical variable 

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

where $P_o$ is the proportion of times the measurements agree, and $P_e$ is the 
proportion of times they can be expected to agree by chance alone. If the measurements 
agree more often than expected by chance, kappa is positive; if concordance is 
complete, kappa = 1; if there is no more nor less than chance concordance, kappa = 0; 
if the measurements disagree more than expected by chance, kappa is negative. 
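
As a minimal sketch, kappa = (P_o - P_e)/(1 - P_e) can be computed from a square table of counts in which rows are one observer's categories and columns the other's. The function name `cohen_kappa` and the counts are hypothetical, and the chance agreement P_e is estimated here from the marginal totals, which is the usual convention but is an assumption not spelled out in the entry.

```python
def cohen_kappa(table):
    """Kappa for a square table: table[i][j] = items rated i by observer A and j by observer B."""
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of items on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / total
    # Chance agreement estimated from the row and column totals (assumption).
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2
    return (p_o - p_e) / (1 - p_e)

ratings = [[20, 5], [10, 15]]  # hypothetical counts for two observers
kappa = cohen_kappa(ratings)
```

With these counts the observed agreement is 0.7 and the chance agreement 0.5, so kappa is 0.4.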

Kendall’s tau See correlation coefficient. 

Koch, Robert (1843-1910) German physician, pathologist, and bacteriologist. One of 
the founders of microbiology and an important contributor to our understanding 
of infectious disease epidemiology. His major contributions to medical science in¬ 
clude the life cycle of anthrax, the etiology of traumatic infection, methods of fixing 
and staining bacteria, and, in 1882, the discovery of the tubercle bacillus. The paper 
reporting this contained the first statement of koch’s postulates. In 1883, he dis¬ 
covered the cholera vibrio. He was awarded the Nobel Prize in 1905. 

Koch’s postulates First formulated by Henle and adapted by Robert Koch in 1877, 
with elaborations in 1882. Koch stated that these postulates should be met before a 
causative relationship can be accepted between a particular bacterial parasite or 
disease agent and the disease in question. 

1. The agent must be shown to be present in every case of the disease by isolation 
in pure culture. 

2. The agent must not be found in cases of other disease. 

3. Once isolated, the agent must be capable of reproducing the disease in exper¬ 
imental animals. 

4. The agent must be recovered from the experimental disease produced. 

See also causality; evans’s postulates. 

kurtosis The extent to which a unimodal distribution is peaked. 

LARGE SAMPLE METHOD (Syn: asymptotic method) Any statistical method based on an 
approximation to a normal or other distribution that becomes more accurate as 
sample size increases. An example is a chi-square test on a set of frequencies. 

latent immunization The process of developing immunity by a single or repeated 
inapparent asymptomatic infection. Not necessarily related to latent infection. See 
also IMMUNITY, ACQUIRED. 

LATENT infection Persistence of an infectious agent within the host without symptoms 
(and often without demonstrable presence in blood, tissues, or bodily secretions of 
host). 

LATENT PERIOD (Syn: latency) Delay between exposure to a disease-causing agent and 
the appearance of manifestations of the disease. After exposure to ionizing 
radiation, for instance, there is a latent period of five years, on average, before 
development of leukemia, and more than 20 years before development of certain other 
malignant conditions. The term "latent period" is often used synonymously with 
"induction period," that is, the period between exposure to a disease-causing agent 
and the appearance of manifestations of the disease. It has also been defined as the 
period from disease initiation to disease detection. See also incubation period; 
INDUCTION PERIOD. 

latin square One of the basic statistical designs for experiments that aim at removing 
from the experimental error the variation from two sources, which may be 
identified with the rows and columns of the square. In such a design the allocation of k 
experimental treatments in the cells of a k by k (latin) square is such that each 
treatment occurs exactly once in each row and column. A design for a 5 x 5 square 
is as follows: 

A B C D E 
B A E C D 
C D A E B 
D E B A C 
E C D B A 

After Kendall and Buckland.¹ 

¹ Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982. 
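
The defining property of a latin square (each treatment exactly once in every row and every column) is easy to check mechanically. The sketch below, with a hypothetical helper `is_latin_square`, verifies the 5 x 5 design printed in the entry.

```python
def is_latin_square(square):
    """True if each symbol occurs exactly once in every row and every column."""
    k = len(square)
    symbols = set(square[0])
    if len(symbols) != k:
        return False
    # A row (or column) of length k whose set equals the k symbols has no repeats.
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(k))
    return rows_ok and cols_ok

# The 5 x 5 design from the entry, one string per row.
design = [row.split() for row in [
    "A B C D E",
    "B A E C D",
    "C D A E B",
    "D E B A C",
    "E C D B A",
]]
valid = is_latin_square(design)
```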

Laveran, Alphonse (1845-1922) French army surgeon who discovered the malaria 
parasite (1880) while on service in Algeria. Though initially sceptical, the scientific 
community soon accepted the validity of Laveran's discovery, which was confirmed 
and enlarged by Golgi, Grassi, and others. Laveran was awarded the Nobel Prize 
for medicine in 1907. 

lead time The time gained in treating or controlling a disease when detection is earlier 
than usual, e.g., in the presymptomatic stage, as when screening procedures are 
used for detection. 

lead time bias (Syn: zero time shift) Overestimation of survival time, due to the 
backward shift in the starting point for measuring survival that arises when diseases 
such as cancer are detected early, as by screening procedures. 

LEAST SQUARES A principle of estimation, due to Gauss, in which the estimates of a set 
of parameters in a statistical model are those quantities that minimize the sum of 
squared differences between the observed values of the dependent variable and the 
values predicted by the model. 
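
To illustrate the principle, the sketch below fits a straight line y = a + bx by least squares using the closed-form solution for this simple case. The straight-line model, the function names, and the data points are assumptions chosen for the example; the principle itself applies to any statistical model.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing the sum of squared differences y - (a + b*x)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [0.0, 1.0, 2.0, 3.0]  # hypothetical factor values
ys = [1.0, 3.0, 4.0, 8.0]  # hypothetical observed values
a, b = least_squares_line(xs, ys)
```

Any other choice of a or b gives a larger value of `sum_squared_residuals` on the same data, which is what "least squares" means.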

LEDERMANN FORMULA Ledermann¹ showed empirically that the frequency distribution 
of alcohol consumption in the population of consumers may be log-normal; the 
curve is sharply skewed: approximately one-third of drinkers consume more than 
60% of the total amount of alcohol. Among drinkers the proportion of persons 
with alcoholism remains constant at around 7-9%. The pattern of consumption of 
illicit drugs among users may also be log-normal. Questions have been raised, 
however, about the validity of some assumptions upon which the formula is based. 

¹ Ledermann S: Alcool, Alcoolisme, Alcoolisation. Paris: Presses universitaires de France, 1956. 

Leeuwenhoek, Antoni van (1632-1723) An early microscopist from Delft, in the 
Netherlands, the first to use his microscopes to examine and describe small 
creatures (animalcules) such as the protozoan organisms in vaginal secretions, 
spermatozoa, and, with growing ability to make more powerful microscopes, infectious 
microorganisms. He was thus a key figure in the development of the germ theory of 
disease. 

Levin’s attributable risk See attributable fraction (population). 

life events Changes or disruptions in the pattern of living that may be associated with 
or produce changes in health. The relationship of "life stress" and "emotional stress" 
to onset of several kinds of serious chronic disease such as coronary heart disease 
and hypertension has been the subject of epidemiologic studies. The Rahe-Holmes 
Social Readjustment Rating Scale* was the first to be developed to assign ranks or 
ratings to significant life events such as death of a spouse or other close relative, 
loss of regular job, relocation, marriage, divorce, etc. Many other rating scales have 
since been developed. 

* Holmes TH, Rahe RH: The social readjustment rating scale. J Psychosom Res 11:213-218, 1967. 

life expectancy See expectation of life. 

LIFE EXPECTANCY FREE FROM DISABILITY (LEFD) An estimate of life expectancy adjusted 
for activity-limitation (data for which are derived from hospital discharge statistics, 
etc.). See also QALY. 

life style The set of habits and customs that is influenced, modified, encouraged, or 
constrained by the lifelong process of socialization. These habits and customs 
include use of substances such as alcohol, tobacco, tea, coffee; dietary habits, exercise, 
etc., which have important implications for health and are often the subject of 
epidemiologic investigations. 

LIFE TABLE A summarizing technique used to describe the pattern of mortality and 
survival in populations. The survival data are time specific and cumulative 
probabilities of survival of a group of individuals subject, throughout life, to the 
age-specific death rates in question. The life table method can be applied to the study 
not only of death, but also of any defined endpoint such as the onset of disease or 
the occurrence of specific complication(s) of disease. The survivors to age x are 
denoted by the symbol $l_x$; the expectation of life at age x is denoted by the symbol 
$\mathring{e}_x$; and the proportion alive at age x who die between age x and x + n years is 
denoted by the symbol ${}_nq_x$. The life table method is used extensively in epidemiology 
and in many assessments of treatment regimens in clinical practice. 

The first rudimentary life tables were published in 1693 by the astronomer Ed¬ 
mund Halley. These made use of records of the funerals in the city of Breslau. In 
1815 in England, the first actuarially correct life table was published, based on both 
population and death data classified by age. 

Two types of life tables may be distinguished according to the reference year of 
the table: the current or period life table and the generation or cohort life table. 

The current life table is a summary of mortality experience over a brief period 
(one to three years), and the population data relate to the middle of that period 
(usually close to the date of a census). A current life table therefore represents the 
combined mortality experience by age of the population in a particular short period 
of time. 

The cohort or generation life table describes the actual survival experience of a 
group, or cohort, of individuals born at about the same time. Theoretically, the 
mortality experience of the persons in the cohort would be observed from their 
moment of birth through each consecutive age in successive calendar years until all 
of them die. 

The clinical life table describes the outcome experience of a group or cohort of 
individuals classified according to their exposure or treatment history. 

Life tables are also classified according to the length of age interval in which the 
data are presented. A complete life table contains data for every single year of age 
from birth to the last applicable age. An abridged life table contains data by 
intervals of five or ten years of age. See also expectation of life, survivorship study. 

LIFE TABLE, EXPECTATION OF LIFE FUNCTION, $\mathring{e}_x$ (Syn: average future lifetime) The 
expectation of life function is a statement of the average number of years of life 
remaining to persons who survive to age x. 

LIFE TABLE, SURVIVORSHIP FUNCTION, $l_x$ The survivorship function is a statement of 
the number of persons out of an initial population of defined size, e.g., 100,000 
live births, who would survive or remain free of a defined endpoint condition to 
age x under the age-specific rates for the specified year. The value of $l_{40}$, for 
example, is determined by the cumulative operation of the specific death rates for all 
ages below 40. 
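
The survivorship and expectation of life functions can be illustrated with a toy complete life table. This is a sketch under stated assumptions: single-year death probabilities q_x are taken as given, deaths are assumed to fall on average at mid-interval, the last probability is 1 so that the table closes, and the numbers are invented for illustration rather than drawn from any real population.

```python
def survivorship(qx, radix=100_000):
    """l_x for x = 0..len(qx): survivors to age x out of `radix` births."""
    lx = [float(radix)]
    for q in qx:
        # Survivors to the next age are those who do not die in the interval.
        lx.append(lx[-1] * (1.0 - q))
    return lx

def life_expectancy(lx):
    """Expectation of life at each age, assuming deaths occur mid-interval."""
    # Person-years lived in each one-year interval: (l_x + l_{x+1}) / 2.
    person_years = [(lx[i] + lx[i + 1]) / 2 for i in range(len(lx) - 1)]
    # Average years remaining = person-years lived from age x on, per survivor at x.
    return [sum(person_years[x:]) / lx[x] for x in range(len(lx) - 1)]

qx = [0.1, 0.2, 0.5, 1.0]          # hypothetical death probabilities by age
lx = survivorship(qx, radix=1000)  # approximately 1000, 900, 720, 360, 0
ex = life_expectancy(lx)
```

The value of l_x at each age depends on every death probability below that age, which is the "cumulative operation" the entry refers to.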

lifetime risk The risk to an individual that a given health effect will occur at any time 
after exposure, without regard for the time at which that effect occurs. 

likelihood function A function constructed from a statistical model and a set of ob¬ 
served data, which gives the probability of the observed data for various values of 
the unknown model parameters. The parameter values that maximize the proba¬ 
bility are the maximum likelihood estimates of the parameters. 

likelihood ratio test A statistical test based on the ratio of the maximum value of 
the likelihood function under one statistical model to the maximum value under 
another statistical model; the models differ in that one includes, the other excludes, 
one or more parameters. 
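
A small worked case may help connect the likelihood function, the maximum likelihood estimate, and the likelihood ratio test. The sketch assumes a binomial model for the number of cases among n subjects and compares a model with a free proportion against one in which the proportion is fixed at a null value; the function names and counts are hypothetical. Referring twice the log of the ratio to a chi-square distribution with 1 degree of freedom is standard large-sample practice, not something stated in the entries above.

```python
import math

def binomial_log_likelihood(p, cases, n):
    """Log likelihood of proportion p (the constant binomial coefficient is omitted)."""
    return cases * math.log(p) + (n - cases) * math.log(1 - p)

def likelihood_ratio_statistic(cases, n, p_null):
    """Twice the log of the ratio of maximized likelihoods (requires 0 < cases < n)."""
    p_hat = cases / n  # maximum likelihood estimate of the proportion
    ll_full = binomial_log_likelihood(p_hat, cases, n)
    ll_null = binomial_log_likelihood(p_null, cases, n)
    return 2 * (ll_full - ll_null)

stat = likelihood_ratio_statistic(cases=30, n=100, p_null=0.2)
```

Here the statistic is about 5.63, larger than 3.84, the 5% critical value of chi-square with 1 degree of freedom.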

Lind, James (1716-1794) British naval surgeon; contributed to improved hygiene aboard 
ships. Conducted what amounted to epidemiologic experiments (albeit with small 
numbers) which established that scurvy could be prevented by fresh fruits such as 
lemons and oranges. 

linear model A statistical model in which the value of a parameter for a given value 
of a factor, x, is assumed to be equal to a + bx, where a and b are constants. 


Linear regression Regression analysis of data using linear models. 

linkage See genetic linkage; record linkage. 

LIVE BIRTH WHO definition adopted by Third World Health Assembly, 1950: Live 
birth is the complete expulsion or extraction from its mother of a product of 
conception, irrespective of the duration of the pregnancy, which, after such separation, 
breathes or shows any other evidence of life, such as beating of the heart, pulsation 
of the umbilical cord, or definite movement of voluntary muscles, whether or not 
the umbilical cord has been cut or the placenta is attached; each product of such a 
birth is considered live born. 

In the Report of WHO Expert Committee on Prevention of Perinatal Mortality and 
Morbidity (Technical Report Series 457, 1970), it is noted that the above definition requires 
the inclusion as live births of very early and patently nonviable fetuses and that 
accordingly it is not strictly applied. The committee suggested, therefore, that WHO 
should introduce a viability criterion into the definition so that very immature 
fetuses surviving for very short periods were excluded, even though they showed one 
or more of the transitory signs of life. 

LOCUS 

1. The position of a point, as defined by the coordinates on a graph. 

2. The position that a gene occupies on a chromosome. 

Lod score In genetics, the log odds ratio of observed to expected distribution of 
genetic markers. 

logistic model A statistical model of an individual's risk (probability of disease y) as a 
function of a risk factor x: 

$$P(y \mid x) = \frac{1}{1 + e^{-\alpha - \beta x}}$$

where e is the (natural) exponential function. This model has a desirable range, 0 
to 1, and other attractive statistical features. In the multiple logistic model, the term 
$\beta x$ is replaced by a linear term involving several factors, e.g., $\beta_1 x_1 + \beta_2 x_2$ if there are 
two factors $x_1$ and $x_2$. 

logit (Syn: log-odds) The logarithm of the ratio of frequencies of two different 
categorical outcomes such as healthy versus sick. 

Logit model A linear model for the logit (natural log of the odds) of disease as a 
function of a quantitative factor X: 

$$\operatorname{logit}(\text{disease given } X = x) = \alpha + \beta x$$

This model is mathematically equivalent to the logistic model. 
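
The equivalence of the logit and logistic forms can be checked numerically: taking the log-odds of a risk computed from the logistic formula returns the linear term. The coefficient values and the names `logistic_risk` and `logit` below are hypothetical, chosen only for illustration.

```python
import math

def logistic_risk(x, alpha, beta):
    """Risk of disease under the logistic model, always between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

alpha, beta = -2.0, 0.5                  # hypothetical coefficients
risk = logistic_risk(3.0, alpha, beta)   # risk at factor level x = 3
linear_term = logit(risk)                # recovers alpha + beta * 3 = -0.5
```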

LOG-LINEAR model A statistical model that uses an analysis of variance type of 
approach for the modeling of frequency counts in contingency tables. 

log-normal DISTRIBUTION If a variable Y is such that X = log Y is normally distributed, 
it is said to have log-normal distribution. This is a skew distribution. See also 
NORMAL DISTRIBUTION. 

LONGITUDINAL STUDY See cohort study. 

Louis, Pierre-Charles-Alexandre (1787-1872) French physician and 
mathematician. One of the founders of medical statistics, his research on tuberculosis, which 
included dissection of 358 specimens and study of 1960 clinical cases, led to 
publication of Recherches anatomico-pathologiques sur la phthisie (Paris, 1825). This work and 
others are marked by rigorous numerical precision and demonstration of 
similarities and differences based upon numerical distribution of data. The Lilienfelds¹ 
have pointed out that Louis greatly influenced the development of statistics as 
applied in biology and medicine; he either taught or otherwise directly influenced 
many European, British, and American workers, including William Farr, John 
Simon, William Augustus Guy, and William Budd in England; George Shattuck, Elisha 
Bartlett, and Alonzo Clark in the United States; and Joseph Skoda in Hungary; 
those he influenced handed on these important concepts to their own pupils. 

¹ Lilienfeld AM, Lilienfeld D: Threads of epidemiological history, in Foundations of Epidemiology, 
2nd ed. (New York: Oxford, 1980), pp. 23-45. 

LOW BIRTH WEIGHT See BIRTH WEIGHT. 

"LUMPING AND SPLITTING" Derisive term describing the propensity of epidemiologists 
to group related phenomena or to separate phenomena that hitherto have been 
grouped. Epidemiologists are sometimes called "lumpers and splitters." 






malaria endemicity Certain terms used to describe the occurrence of malaria, based 
on enlarged spleen rates, are categorized by WHO as follows: 

1. Hypoendemic: Spleen rate in children 2-9 years <10%. 

2. Mesoendemic: Spleen rate 11-50%. 

3. Hyperendemic: Spleen rate in children over 50%, in adults usually over 25%. 

4. Holoendemic: Spleen rate in children constantly over 75%, adult rate low. 

MALARIA periodicity Recurrence at regular intervals of symptoms; periodicity may be 
quotidian, tertian, or quartan, according to the interval between paroxysms: 

1. Quartan. Recurring every third day, i.e., day 1, day 4, day 7, etc. 

2. Quotidian. Recurring daily. 

3. Tertian. Recurring every alternate day, i.e., day 1, day 3, etc. 

malaria patent period Period during which parasites are present in peripheral blood. 

malaria reproduction rate Estimated number of malarial infections potentially 
distributed by the average nonimmune infected individual in a community where 
neither persons nor mosquitoes were previously infected. 

malaria survey Investigation in selected age-group samples in randomly selected 
localities to assess malaria endemicity; uses spleen and/or parasite rates as measure of 
endemicity. 

Malthus, Thomas Robert (1766-1834) An English clergyman and natural scientist 
who argued in An Essay on the Principle of Population (London, 1798) that popula¬ 
tions increase in geometric progression while food supplies increase only in arith¬ 
metical progression, thus making famine inevitable. His work justifies his recogni¬ 
tion as one of the founders of demography, even though events proved his predictions 
wrong (at least in the short term). 

Manson, Patrick (1844-1922) Studied tropical diseases in China and made many 
contributions of fundamental importance, notably the transmission of filariasis by 
culicine mosquitoes and parts of the life cycle of schistosomes. He investigated and 
observed many other tropical parasitic diseases and founded the London School of 
(Hygiene and) Tropical Medicine in 1898. 

Mantel-Haenszel estimate, Mantel-Haenszel odds ratio Mantel and Haenszel¹ 
provided an adjusted odds ratio as an estimate of relative risk that may be derived 
from grouped and matched sets of data. It is now known as the Mantel-Haenszel 
estimate, one of the few eponymous terms of modern epidemiology. 

The statistic may be regarded as a type of weighted average of the individual 
odds ratios, derived from stratifying a sample into a series of strata that are 
internally homogeneous with respect to confounding factors. 

The Mantel-Haenszel summarization method can also be extended to the 
summarization of rate ratios and rate differences from follow-up studies. 

¹ Mantel N, Haenszel W: Statistical aspects of the analysis of data from retrospective studies of 
disease. J Natl Cancer Inst 22:719-748, 1959. 
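
The entry does not print the estimator, so the sketch below uses the usual textbook form of the Mantel-Haenszel odds ratio, the sum over strata of ad/n divided by the sum over strata of bc/n. The cell layout (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls), the function name, and the counts are assumptions made for illustration.

```python
def mantel_haenszel_odds_ratio(strata):
    """Summary odds ratio over strata, each given as a 2x2 table (a, b, c, d)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d            # stratum size
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Two hypothetical strata with stratum-specific odds ratios 4.0 and 3.0.
strata = [(10, 20, 5, 40), (15, 10, 10, 20)]
or_mh = mantel_haenszel_odds_ratio(strata)
```

The summary value falls between the stratum-specific odds ratios, in keeping with the description of the statistic as a type of weighted average.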

Mantel-Haenszel test A summary chi-square test developed by Mantel and 
Haenszel for stratified data and used when controlling for confounding. 

margin OF safety An estimate of the ratio of the no-observed-effect level (NOEL) to 
the level accepted in regulations. 

marginals The row and column totals of a contingency table. 

Markov process A stochastic process such that the conditional probability distribution 
for the state at any future instant, given the present state, is unaffected by any 
additional knowledge of the past history of the system. 

MASKED STUDY See BLINDED STUDY. 

masking Procedure(s) intended to keep participant(s) in a study from knowing some 
fact(s) or observation(s) that might bias or influence their actions or decisions re¬ 
garding the study. 

matched controls See controls, matched. 

matching The process of making a study group and a comparison group comparable 
with respect to extraneous factors. Several kinds of matching can be distinguished: 

Caliper matching is the process of matching comparison group subjects to study 
group subjects within a specified distance for a continuous variable (e.g., matching 
age to within two years). 

Frequency matching requires that the frequency distributions of the matched 
variable(s) be similar in study and comparison groups. 

Category matching is the process of matching study and control group subjects 
in broad classes such as relatively wide age ranges or occupational groups. 

Individual matching relies on identifying individual subjects for comparison, each 
resembling a study subject on the matched variable(s). 

Pair matching is individual matching in which study and comparison subjects are 
paired. 

maternal mortality (rate) The risk of dying from causes associated with childbirth 
is measured by the maternal mortality rate. For this purpose the deaths used in the 
numerator are those arising during pregnancy or from puerperal causes, i.e., deaths 
occurring during and/or due to deliveries, complications of pregnancy, childbirth, 
and the puerperium. Women exposed to the risk of dying from puerperal causes 
are those who have been pregnant during the period. Their number being 
unknown, the number of live births is used as the conventional denominator for 
computing comparable maternal mortality rates. The formula is 

$$\text{Annual maternal mortality rate} = \frac{\text{Number of deaths from puerperal causes in a given geographic area during a given year}}{\text{Number of live births that occurred among the population of the given geographic area during the same year}} \times 1{,}000 \ (\text{or } 100{,}000)$$
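
As a quick arithmetic illustration of the rate (puerperal deaths divided by live births, times a multiplier), the sketch below uses invented counts and a multiplier of 100,000; the function name and the figures are hypothetical.

```python
def maternal_mortality_rate(puerperal_deaths, live_births, per=100_000):
    """Deaths from puerperal causes per `per` live births in the same area and year."""
    return puerperal_deaths / live_births * per

# Hypothetical area: 12 puerperal deaths and 48,000 live births in one year.
rate = maternal_mortality_rate(12, 48_000)  # about 25 per 100,000 live births
```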

There is variation in the duration of the postpartum period in which death may 
occur and be certified due to "puerperal causes," i.e., "maternal mortality." 
According to WHO, a maternal death is defined as the death of a woman while pregnant 
or within 42 days of termination of pregnancy, irrespective of the duration and the 
site of pregnancy, from any cause related to or aggravated by the pregnancy or its 
management but not from accidental or incidental causes. 

Maternal deaths should be subdivided into two groups: (1) direct obstetric deaths,
resulting from obstetric complications of the pregnant state, and (2) indirect obstetric
deaths, resulting from preexisting disease or conditions not due to direct obstetric
causes.

Although WHO defines maternal mortality as death during pregnancy or within 
42 days of delivery, in some jurisdictions, a period as long as a year is used. 

mathematical model A representation of a system, process, or relationship in mathematical
form in which equations are used to simulate the behavior of the system
or process under study. The model usually consists of two parts: the mathematical
structure itself, e.g., Newton’s inverse square law or Gauss’s “normal” law, and the
particular constants or parameters associated with them, such as Newton’s gravitational
constant or the Gaussian standard deviation.

A mathematical model is deterministic if the relations between the variables involved
take on values not allowing for any play of chance. A model is said to be
statistical, stochastic, or random, if random variation is allowed to enter the picture.
See also model.

MAXIMUM ALLOWABLE CONCENTRATION (MAC) See SAFETY STANDARDS. 

maximum likelihood ESTIMATE The value for an unknown parameter that maximizes 
the probability of obtaining exactly the data that were observed. 
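
A minimal sketch may make the idea concrete. It assumes a binomial model (a hypothetical 7 cases among 20 subjects) and a simple grid search; the function names are illustrative and not taken from the entry. The grid search should agree with the known closed-form binomial estimate, successes divided by n.

```python
from math import comb


def binomial_likelihood(p: float, successes: int, n: int) -> float:
    """Probability of exactly `successes` events in `n` trials, given p."""
    return comb(n, successes) * p**successes * (1 - p) ** (n - successes)


def mle_by_grid(successes: int, n: int, steps: int = 1000) -> float:
    """Return the grid value of p that maximizes the binomial likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, successes, n))


# Hypothetical data: 7 cases observed among 20 subjects.
estimate = mle_by_grid(successes=7, n=20)
print(estimate, 7 / 20)
```

In practice the maximum is usually found analytically or by numerical optimization of the log-likelihood; the grid is used here only because it is easy to read.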

McNemar’s test A form of the chi-square test for matched-pairs data. It is a special 
case of the mantel-haenszel test. 
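
As a hedged illustration, the commonly used form of the statistic depends only on the discordant pairs: with b pairs in which only the first member is exposed and c pairs in which only the second is, chi-square = (b - c)^2 / (b + c) on one degree of freedom. The sketch below assumes this uncorrected form (no continuity correction) and hypothetical counts; the function name is illustrative.

```python
from math import erfc, sqrt


def mcnemar_test(b: int, c: int):
    """Return (chi-square, p-value) for discordant pair counts b and c.

    Uses the uncorrected statistic (b - c)^2 / (b + c) with 1 degree of
    freedom. For 1 df the upper tail probability is erfc(sqrt(x / 2)).
    """
    if b + c == 0:
        raise ValueError("at least one discordant pair is required")
    chi_square = (b - c) ** 2 / (b + c)
    p_value = erfc(sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical matched-pairs data: 15 and 5 discordant pairs.
chi_square, p_value = mcnemar_test(b=15, c=5)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

Concordant pairs do not enter the calculation, which is why only two counts are needed.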

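mean, arithmetic A measure of central tendency. It is calculated by adding together
all the individual values and dividing by the number of values.

mean, geometric A measure of central tendency. Calculable only for positive values.
It is calculated by taking the logarithms of the values, calculating their arithmetic
mean, then converting back by taking the antilogarithm.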

mean, harmonic A measure of central tendency computed by summing the reciprocals
of all the individual values and dividing the resulting sum into the number
of values.
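
The calculations described for the means can be sketched in a few lines. The function names and the sample values are illustrative assumptions, not part of the entries; Python's standard `statistics` module offers equivalent functions.

```python
from math import exp, log


def arithmetic_mean(values):
    """Sum of the values divided by their number."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Antilogarithm of the arithmetic mean of the logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    return exp(sum(log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1.0 / v for v in values)


# Hypothetical measurements.
data = [1.0, 2.0, 4.0]
print(arithmetic_mean(data), geometric_mean(data), harmonic_mean(data))
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.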

measure of association A quantity that expresses the strength of association between 
variables. Commonly used measures of association are differences between means, 
proportions or rates, the rate ratio, the odds ratio, and correlation and regression 
coefficients. 

measurement The procedure of applying a standard scale to a variable or to a set of 
values. 

measurement, problems with terminology There is sometimes uncertainty about
the terms used to describe the properties of measurement: accuracy, precision, validity,
reliability, repeatability, and reproducibility. Accuracy and precision are often
used synonymously, validity is defined variously, and reliability, repeatability, and
reproducibility are frequently used interchangeably.

Etymologies are helpful in making a case for preferred usages, but they are not
always decisive. Accuracy is from the Latin cura, care, and while this may be of
interest to those in the health field, it does not illuminate the origins of the standard
definition, that is, “conforming to a standard or a true value” (OED). Accuracy is
distinguished from precision in this way: A measurement or statement can reflect
or represent a true value without detail. A temperature reading of 98.6°F is accurate,
but it is not precise if a more refined thermometer registers a temperature of
98.637°F.

Precision (from Latin praecidere, cut short) is the quality of being sharply defined
through exact detail. A faulty measurement may be expressed precisely, but may
not be accurate. Measurements should be both accurate and precise, but the two
terms are not synonymous. Consistency or reliability describes the property of measurements
or results that conform to themselves.
Reliability (Latin religare, to bind) is defined by the OED as a quality that is sound
and dependable. Its epidemiologic usage is similar; a result or measurement is said
to be reliable when it is stable, i.e., when repetition of an experiment or measurement
gives the same results. The terms “repeatability” and “reproducibility” are
synonymous (the OED defines each in terms of the other), but they do not refer to
a quality of measurement, rather only to the action of performing something more
than once. Thus, a way of discovering whether or not a measurement is reliable is
to repeat or reproduce it. The terms “repeatability” and “reproducibility,” formed
from their respective verbs, are used inaccurately when they are substituted for
“reliability,” a noun that refers to the measuring procedure rather than the attribute
being measured. However, in common usage, both repeatability and reproducibility
refer to the capacity of a measuring procedure to produce the same
result on each occasion in a series of procedures conducted under identical conditions.

Validity is used correctly when it agrees with the standard definition given by the
OED: “sound and sufficient.” If, in the epidemiologic sense, a test measures what it
purports to measure (it is sufficient) then the test is said to be valid. See also accuracy;
precision; reliability; repeatability; validity.

measurement scale The complete range of possible values for a measurement (e.g.,
the set of possible responses to a question, the physically possible range for a set of
body weights). Measurement scales are sometimes classified into five major types,
according to the quantitative character of the scale:

1. Dichotomous scale: One that arranges items into either of two mutually exclusive 
categories. 

2. Nominal scale: Classification into unordered qualitative categories; e.g., race,
religion, and country of birth as measurements of individual attributes are
purely nominal scales, as there is no inherent order to their categories.

3. Ordinal scale: Classification into ordered qualitative categories, e.g., social class
(I, II, III, etc.), where the values have a distinct order, but their categories are
qualitative in that there is no natural (numerical) distance between their possible
values.

4. Interval scale: An (equal) interval scale involves assignment of values with a natural
distance between them, so that a particular distance (interval) between two
values in one region of the scale meaningfully represents the same distance
between two values in another region of the scale. Examples include Celsius
and Fahrenheit temperature, date of birth.

5. Ratio scale: A ratio scale is an interval scale with a true zero point, so that ratios
between values are meaningfully defined. Examples are absolute temperature,
weight, height, blood count, and income, as in each case it is meaningful to
speak of one value as being so many times greater or less than another value.

measures of central tendency A general term for several characteristics of the distribution
of a set of values or measurements around a value or values at or near
the middle of the set. The principal measures of central tendency are the mean
(average), median, and mode. (See entries under each.)

mechanical transmission See vector-borne infection.

median A measure of central tendency. The simplest division of a set of measurements
is into two parts: the lower and the upper half. The point on the scale that
divides the group in this way is called the “median.”

mediator (mediating) variable See intermediate variable.

medical audit A health service evaluation procedure in which selected data from patients’
charts are summarized in tables displaying such data as average length of
stay or duration of an episode of care, the frequency of diagnostic and therapeutic
procedures, and outcomes of care arranged by diagnostic category. These are often
compared with predetermined norms.

medical care See health care.

medical record A file of information relating to transaction(s) in personal health care.
In addition to facts about a patient’s illness, medical records nearly always contain
other information. The full range of data in medical records includes the following:

1. Clinical, i.e., diagnosis, treatment, progress, etc. 

2. Demographic, i.e., age, sex, birthplace, residence, etc. 

3. Sociocultural, i.e., language, ethnic origin, religion, etc. 

4. Sociological, i.e., family (next of kin), occupation, etc. 

5. Economic, i.e., method of payment (fee-for-service, indigent, etc.).

6. Administrative, i.e., site of care, provider, etc. 

7. “Behavioral,” e.g., record of broken appointment may indicate dissatisfaction 
with service provided. 

medical statistics See biostatistics.

Mendel’s laws Derived from the pioneering genetic studies of Gregor Mendel (1822-
1884). Mendel’s first law states that genes are particulate units that segregate; i.e.,
members of the same pair of genes are never present in the same gamete, but
always separate and pass to different gametes. Mendel’s second law states that genes
assort independently; i.e., members of different pairs of genes move to gametes
independently of one another.

meta-analysis The process of using statistical methods to combine the results of different
studies. In the biomedical sciences, the systematic, organized and structured
evaluation of a problem of interest, using information (commonly in the form of
statistical tables or other data) from a number of independent studies of the problem.
A frequent application has been the pooling of results from a number of small
randomized controlled trials, none in itself large enough to demonstrate statistically
significant differences, but in aggregate, capable of so doing. Meta-analysis has a
qualitative component, i.e., application of predetermined criteria of quality (e.g.,
completeness of data, absence of biases), and a quantitative component, i.e., integration
of the numerical information. Meta-analysis includes aspects of an overview,
and of pooling of data, but implies more than either of these processes. Meta-analysis
carries the risk of several biases.
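
The quantitative component can be illustrated with a minimal sketch. It assumes fixed-effect inverse-variance weighting, which is one common pooling technique and is not prescribed by this entry; the trial estimates (log odds ratios) and standard errors are hypothetical, and the function name is illustrative.

```python
from math import sqrt


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average and its standard error."""
    weights = [1.0 / se**2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log odds ratios from three small trials.
log_odds_ratios = [-0.30, -0.10, -0.25]
standard_errors = [0.20, 0.25, 0.15]

pooled, se = pool_fixed_effect(log_odds_ratios, standard_errors)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled = {pooled:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

The pooled standard error is smaller than that of any single trial, which is how small trials that are individually inconclusive can yield a significant result in aggregate.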

methodology The scientific study of methods. Methodology should not be confused
with methods. Sad to say, the word “methodology” is all too often used when the 
writer means “method.” 

miasma theory An explanation for the origin of epidemics, the “miasma theory” was
implied by many ancient writers, and made explicit by Lancisi in De noxiis paludum
effluviis (1717). It was based on the notion that when the air was of a “bad quality”
(a state that was not precisely defined, but that was supposedly due to decaying
organic matter), the persons breathing that air would become ill. Malaria (“bad air”)
is the classic example of a disease that was long attributed to miasmata. “Miasma”
was believed to pass from cases to susceptibles in those diseases considered contagious.

migrant studies Studies taking advantage of migration to one country by those from
other countries with different physical and biological environments, cultural background
and/or genetic makeup, and different morbidity or mortality experience.
Comparisons are made between the mortality or morbidity experience of the migrant
groups with that of their current country of residence and/or their country
of origin. Sometimes the experiences of a number of different groups who have
migrated to the same country have been compared.

Mill’s canons In A System of Logic (1856), J.S. Mill devised logical strategies (canons)
from which causal relationships may be inferred. Four in particular are pertinent 
to epidemiology: the methods of agreement, difference, residues, and concomitant 
variation. 

Method of agreement (first canon): “If two or more instances of the phenomenon 
under investigation have only one circumstance in common, the circumstance in 
which alone all the instances agree, is the cause (or effect) of the given phenomenon.”

Method of difference (second canon): “If an instance in which the phenomenon 
under investigation occurs, and an instance in which it does not occur, have every 
circumstance in common save one, that one occurring only in the former, the circumstance
in which alone the two instances differ is the effect, or cause or a necessary
part of the cause, of the phenomenon.”

Method of residues (fourth canon): “Subduct from any phenomenon such part as
is known by previous inductions to be the effect of certain antecedents, and the 
residue of the phenomenon is the effect of the remaining antecedents.” 

Method of concomitant variation (fifth canon): “Whatever phenomenon varies in any
manner whenever another phenomenon varies in some particular manner, is either
a cause or an effect of that phenomenon, or is connected with it through some fact
of causation.”

minimum data set (Syn: uniform basic data set) A widely agreed upon and generally
accepted set of terms and definitions constituting a core of data acquired for medical
records and employed for developing statistics suitable for diverse types of analyses
and users. Such sets have been developed for birth and death certificates, ambulatory
care, hospital care, and long-term care. See also birth certificate; death
certificate; hospital discharge abstract system.

misclassification The erroneous classification of an individual, a value, or an attribute
into a category other than that to which it should be assigned. The probability
of misclassification may be the same in all study groups (nondifferential misclassification)
or may vary between groups (differential misclassification).

mobility, geographic Movement of persons from one country or region to another.

mobility, social Movement from one defined socioeconomic group to another, either
upward or downward. Downward social mobility, which can be related to impaired
health (e.g., alcoholism, schizophrenia, or mental retardation), is sometimes referred
to as “social drift.”

MODE One of the measures of central tendency. The most frequently occurring value
in a set of observations.

MODEL 

1. An abstract representation of the relationship between logical, analytical, or
empirical components of a system. See also mathematical model.

2. A formalized expression of a theory or the causal situation that is regarded as 
having generated observed data. 

3. (Animal) model: an experimental system that uses animals, because humans
cannot be used for ethical or other reasons.

4. A small-scale simulation, e.g., by using an “average region” with characteristics
resembling those of the whole country.
In epidemiology the use of models began with an effort to predict the onset and
course of epidemics. In the second report of the Registrar-General of England and
Wales (1840), William Farr developed the beginnings of a predictive model for
communicable disease epidemics. He had recognized regularities in the smallpox
epidemics of the 1830s. By calculating frequency curves for these past outbreaks,
he estimated the deaths to be expected. See also demonstration model; mathematical
model; theoretical epidemiology.

moderator variable (Syn: qualifier variable) In a study of a possible causal factor
and an outcome, a moderator variable is a third variable exhibiting statistical interaction
by virtue of its being antecedent or intermediate in the causal process under
study. If it is antecedent, it is termed a conditional moderator variable or effect
modifier; if it is intermediate, it is a contingent moderator variable. See also interaction;
intermediate variable.

MONITORING 

1. The performance and analysis of routine measurements, aimed at detecting
changes in the environment or health status of populations. Not to be confused
with surveillance. To some, monitoring also implies intervention in the
light of observed measurements.

2. Ongoing measurement of performance of a health service or a health professional,
or of the extent to which patients comply with or adhere to advice from
health professionals.

3. In management, the continuous oversight of the implementation of an activity
that seeks to ensure that input deliveries, work schedules, targeted outputs,
and other required actions are proceeding according to plan.

monotonic sequence A sequence is said to be monotonic increasing if each value is
greater than or equal to the previous one, and monotonic decreasing if each value 
is less than or equal to the previous one. If equality of values is excluded, we speak 
of a strictly (increasing or decreasing) monotonic sequence. 
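
These definitions translate directly into a short check. The function names and the `strict` flag are illustrative choices, not part of the entry.

```python
def is_monotonic_increasing(sequence, strict=False):
    """True if each value is >= (or >, when strict) the previous one."""
    pairs = list(zip(sequence, sequence[1:]))
    if strict:
        return all(earlier < later for earlier, later in pairs)
    return all(earlier <= later for earlier, later in pairs)


def is_monotonic_decreasing(sequence, strict=False):
    """True if each value is <= (or <, when strict) the previous one."""
    pairs = list(zip(sequence, sequence[1:]))
    if strict:
        return all(earlier > later for earlier, later in pairs)
    return all(earlier >= later for earlier, later in pairs)


# Hypothetical sequence containing one repeated value.
doses = [1, 2, 2, 5]
print(is_monotonic_increasing(doses))
print(is_monotonic_increasing(doses, strict=True))
```

The repeated value keeps the sequence monotonic increasing but prevents it from being strictly increasing.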

Monte Carlo study, trial Complex relationships that are difficult to solve by mathematical
analysis are sometimes studied by computer experiments that simulate and
analyze a sequence of events, using random numbers. Such experiments are called 
Monte Carlo trials, or studies, in recognition of Monte Carlo as one of the gambling 
capitals of the world. 
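
A toy computer experiment shows the approach. The example below is an assumption chosen for simplicity, not taken from the entry: it estimates the probability that two dice sum to seven, a quantity whose exact value (1/6) is known, so the simulation can be checked against it. The seed and function name are illustrative.

```python
import random


def estimate_probability_of_seven(trials: int, seed: int = 0) -> float:
    """Simulate rolling two dice `trials` times; return the share summing to 7."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials


estimate = estimate_probability_of_seven(100_000)
print(f"simulated: {estimate:.4f}, exact: {1 / 6:.4f}")
```

The same pattern (generate random inputs, apply the rules of the system, tally the outcomes) applies when no exact answer is available, which is the usual reason for running such a study.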

morbidity Any departure, subjective or objective, from a state of physiological or psychological
well-being. In this sense, sickness, illness, and morbid condition are similarly
defined and synonymous (but see disease).

The WHO Expert Committee on Health Statistics noted in its Sixth report (1959) 
that morbidity could be measured in terms of three units: (1) persons who were ill;
(2) the illnesses (periods or spells of illness) that these persons experienced; and (3) 
the duration (days, weeks, etc.) of these illnesses. See also health index; incidence 
rate; notifiable disease; prevalence rate. 

morbidity rate A term, preferably avoided, used indiscriminately to refer to incidence 
or prevalence rates of disease.

morbidity survey A method for estimating the prevalence and/or incidence of disease 
or diseases in a population. A morbidity survey is usually designed simply to ascertain
the facts as to disease distribution, and not to test a hypothesis. See also cross-sectional
study; health survey.

mortality rate See death rate.

mortality statistics Statistical tables compiled from the information contained in death
certificates. Most administrative jurisdictions in all nations produce tables of mortality
statistics. These may be published at regular intervals; they usually show numbers
of deaths and/or rates by age, sex, cause, and sometimes other variables.

multicollinearity In multiple regression analysis, a situation in which at least some
of the independent variables are highly correlated with each other. Such a situation
can result in inaccurate estimates of the parameters in the regression model.
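
One hedged way to see the problem is to measure how strongly two independent variables are correlated. The sketch below assumes only two predictors, hypothetical data, and the variance inflation factor (VIF) as the diagnostic; the VIF is a commonly used measure that this entry does not itself mention. With two predictors, VIF = 1 / (1 - r^2), where r is their Pearson correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def variance_inflation_factor(x, y):
    """VIF for a model with exactly two predictors, x and y."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors that rise almost in lockstep.
predictor_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
predictor_2 = [2.1, 3.9, 6.2, 7.8, 10.1]

print(f"r = {pearson_r(predictor_1, predictor_2):.3f}")
print(f"VIF = {variance_inflation_factor(predictor_1, predictor_2):.0f}")
```

A large VIF indicates that the variance of the estimated coefficients is inflated, which is the practical meaning of the inaccurate estimates described here.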

MULTIFACTORIAL ETIOLOGY See MULTIPLE CAUSATION.

multinomial distribution The probability distribution associated with the classification
of each of a sample of individuals into one of several mutually exclusive and
exhaustive categories. When the number of categories is two, the distribution is 
called binomial. See also binomial distribution. 

MULTIPHASIC SCREENING See SCREENING.

multiple causation (Syn: multifactorial etiology) This term is used to refer to the 
concept that a given disease or other outcome may have more than one cause. A 
combination of causes or alternative combinations of causes may be required to 
produce the effect. 

MULTIPLE LOGISTIC MODEL See LOGISTIC MODEL. 

MULTIPLE risk Where more than one risk factor for the development of a disease or 
other outcome is present, and their combined presence results in an increased risk, 
we speak of “multiple risk.” The increased risk may be due to the additive effects
of the risks associated with the separate risk factors, or to synergism. 

multiplicative model A model in which the joint effect of two or more causes is the 
product of their effects. For instance, if factor a multiplies risk by the amount a in
the absence of factor b, and factor b multiplies risk by the amount b in the absence
of factor a, the combined effect of factors a and b on risk is a × b. See also ADDITIVE
MODEL.
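
A small numerical sketch contrasts the two models. The relative risks are hypothetical, the function names are illustrative, and the additive formula (excess risks add, giving a + b - 1 on the relative-risk scale) is a conventional formulation assumed here rather than one stated in this entry.

```python
def joint_effect_multiplicative(rr_a, rr_b):
    """Joint relative risk when the separate effects multiply."""
    return rr_a * rr_b


def joint_effect_additive(rr_a, rr_b):
    """Joint relative risk when the excess risks (RR - 1) add."""
    return rr_a + rr_b - 1


# Hypothetical relative risks for factor a alone and factor b alone.
rr_a, rr_b = 3, 4
print(joint_effect_multiplicative(rr_a, rr_b))
print(joint_effect_additive(rr_a, rr_b))
```

When both factors raise risk, the multiplicative prediction exceeds the additive one, so comparing an observed joint effect with each prediction suggests which model describes the data better.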

multistage model A mathematical model, mainly for carcinogenesis, based on the 
theory that a specific carcinogen may affect one of a number of stages in the development
of cancer.

multivariate analysis A set of techniques used when the variation in several variables
has to be studied simultaneously. In statistics, any analytic method that allows
the simultaneous study of two or more dependent variables. 

mutation Heritable change in the genetic material not caused by genetic segregation 
or recombination, which is transmitted to daughter cells and to succeeding generations,
provided it is not a dominant lethal factor.

mutation rate The frequency with which mutations occur per gene or per generation.
national death index A computerized central registry of deaths in the United States,
started in 1979 and operated by the U.S. National Center for Health Statistics, that
facilitates mortality followup; cf. Canadian mortality data base.

natural experiment A term probably derived from John Snow’s account of his investigation
of the practices of water supply companies in relation to the cholera epidemics
in London in the 1850s. It refers to naturally occurring circumstances in
which populations have different exposures to a supposed causal factor in a situation
resembling an actual experiment in which persons would be assigned to groups.

John Snow was able to trace the London outbreaks of cholera in the 19th century 
to water impurity as a result of comparisons made between two water companies. 
It would have been unethical to expose “test subjects” to infection, but the situation 
at the time afforded him the opportunity to make observations of crucial importance.

To turn this grand experiment to account, all that was required was to learn the
supply of water to each individual house where a fatal attack of cholera might occur
... I resolved to spare no exertion which might be necessary to ascertain the exact
effect of the water supply on the progress of the epidemic, in the places where all
the circumstances were so happily adapted for the inquiry ... I had no reason to
doubt the correctness of the conclusions I had drawn from the great number of
facts already in my possession, but I felt that the circumstances of the cholera-poisoning
passing down the sewers into a great river, and being distributed through miles of
pipes, and yet producing its specific effects was a fact of so startling a nature, and
of so vast importance to the community, that it could not be too rigidly examined
or established on too firm a basis. (Snow, On the Mode of Communication of Cholera,
1855)

natural history of disease The course of a disease from onset (inception) to resolution.
Many diseases have certain well-defined stages that, taken all together, are
referred to as the “natural history of the disease” in question. These stages are as
follows:

1. Stage of pathological onset.

2. Presymptomatic stage: from onset to the first appearance of symptoms and/or
signs. Screening tests may lead to earlier detection.

3. Clinically manifest disease, which may progress inexorably to a fatal termination,
be subject to remissions and relapses, or regress spontaneously, leading
to recovery.

Detection and intervention can alter the natural history of disease. The term has
also been used to mean “descriptive epidemiology of disease.”

natural history study A study, generally longitudinal, designed to yield information
about the natural course of a disease or condition.

natural rate of increase (decrease) See growth rate of population.

nearest neighbor method A means of analyzing the spatial patterns of a free-living
population. A term from veterinary epidemiology. Random sampling points are
located throughout an area and the distance from each point to the nearest individual
is measured; alternatively, individuals are selected at random and from each of
these the distance to the nearest neighbor is measured.
necessary and sufficient cause A causal factor whose presence is required for the
occurrence of the effect and whose presence is always followed by the effect. See 
also association; causality. 

needs (Syn: health needs, perceived needs, professionally defined needs, unmet needs) 
This term has both a precise and an all-but-indefinable meaning in the context of
public health. We speak of needs in precise numerical terms when we refer to 
specific indicators of disease or premature death that require intervention because 
their level is above that generally accepted in the society or community in question. 
For example, an infant mortality rate two or three times greater than the national 
average in a particular community is an indicator of unmet health needs of infants 
in that community (not to be confused with a need for more or better medical care). 
It should be clear that even in this seemingly precise usage there are implied value 
judgments. It must be explicitly stated that “needs” always reflect prevailing value 
judgments as well as the existing ability to control a particular public health problem.
Thus, sputum-positive pulmonary tuberculosis was not recognized as a health
need in 1850 but was by 1900 in the industrialized nations; the ill effects of cigarette
smoking must now be universally acknowledged as a health need; and child
abuse is increasingly regarded as a public health problem, to which we could apply
the term “professionally defined need.”

(See Vickers GR: What sets the goals of public health. Lancet 1:599, 1958.) 

neonatal mortality rate 

1. In vital statistics, the number of deaths in infants under 28 days of age in 
a given period, usually a year, per 1000 live births in that period. 

2. In obstetric and perinatal research the term “neonatal mortality rate” is often 
used to denote the cumulative mortality rate of live-born infants within 28
days of age. 

nested case control STUDY A case control study in which cases and controls are drawn 
from the population in a cohort study. As some data are already available about 
both cases and controls, the effects of some potential confounding variables are 
reduced or eliminated. 

net migration The numerical difference between immigration and emigration. 

net migration rate The net effect of immigration and emigration on an area’s population
expressed as an increase or decrease per 1000 population of the area in a
given year. 
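
The arithmetic can be sketched as follows, assuming hypothetical counts and an illustrative function name; a negative result indicates net loss of population through migration.

```python
def net_migration_rate(immigrants: int, emigrants: int,
                       population: int, per: int = 1000) -> float:
    """Net migration (immigrants minus emigrants) per `per` population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return (immigrants - emigrants) / population * per


# Hypothetical area with 400,000 residents in the given year.
rate = net_migration_rate(immigrants=5_000, emigrants=3_000,
                          population=400_000)
print(f"{rate:.1f} per 1000 population")
```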

net reproduction rate The average number of female children born per woman in
a cohort subject to a given set of age-specific fertility rates, a given set of age-specific
mortality rates, and a given sex ratio at birth. This rate measures replacement fertility
under given conditions of fertility and mortality: it is the ratio of daughters to
mothers assuming continuation of the specified conditions of fertility and mortality.
It is a measure of population growth from one generation to another under constant
conditions. This rate is similar to the gross reproduction rate, but takes into
account that some women will die before completing their childbearing years. An 
NRR of 1.00 means that each generation of mothers is having exactly enough 
daughters to replace itself in the population. See also gross reproduction rate. 

New York State Identification and Intelligence System (NYSIIS) A method of
identifying individuals for record linkage based on phonetic spelling of full names,
sequence of digits for birthdate, birthplace, sex, name at birth, and parents’ names.
See also Hogben number; Soundex code.

nidus A focus of infection. The term can be used to describe any heterogeneity in the
distribution of a disease, but is usually applied to a small area in which conditions
favor occurrence and spread of a communicable disease; also, the site of origin of
a pathological process.

Nightingale, Florence (1820-1910) An English woman who is identified as the founder
of modern nursing, but was much more. In addition to her famous work of elevating
nursing to a noble profession during the Crimean War, and establishing a training
school for nurses at St. Thomas’s Hospital in London, she recognized the importance
of statistical analysis of hospital records (Notes on Hospitals, London:
Longmans, 1859); her contributions were recognized by election to Fellowship of
the Royal Statistical Society. Her best-known work is Notes on Nursing (1860).

noise (in data) This term is used when extraneous uncontrolled variables and/or errors
influence the distribution of measurements that are made in a study, thus
rendering difficult or impossible the determination of relationships between variables
under scrutiny.

nomenclature A list of all approved terms for describing and recording observations.

NOMINAL SCALE See MEASUREMENT SCALE. 

nomogram A form of line chart showing scales for the variables involved in a particular
formula in such a way that corresponding values for each variable lie on a straight 
line intersecting all the scales. 






Nomogram of confidence limits to a rate.
From Rosenbaum, Nomograms for rates per 1000, Br Med J 1:169-170, 1963.
NONCONCURRENT STUDY See HISTORICAL COHORT STUDY.

NONDIFFERENTIAL MISCLASSIFICATION See MISCLASSIFICATION.

NONEXPERIMENTAL STUDY See OBSERVATIONAL STUDY.

NONPARAMETRIC METHODS See DISTRIBUTION-FREE METHOD. 

NONPARAMETRIC TEST See DISTRIBUTION-FREE METHOD. 

nonparticipants (Syn: nonresponders) Members of a study sample or population who
do not take part in the study for whatever reason, or members of a target population
who do not participate in an activity. Differences between participants and
nonparticipants have been demonstrated repeatedly in studies of many kinds, and
this is often a source of bias.

no-observed-effect level (NOEL) A term from toxicology, meaning the highest dose
at which no adverse health effects are detected in an animal population. A NOEL-SF
is a no-observed-effects level with an added safety factor for human exposures,
used in setting human safety standards.

norm This term has two quite distinct meanings:

1. The first is “what is usual,” e.g., the range into which blood pressure values
usually fall in a population group, the dietary or infant feeding practices that 
are usual in a given culture, or the way that a given illness is usually treated 
in a given health care system. 

2. The second sense is "what is desirable," e.g., the range of blood pressures that 
a given authority regards as being indicative of present good health or as 
predisposing to future good health, the dietary or infant feeding practices that 
are valued in a given culture, or the health care procedures or facilities for
health care that a given authority regards as desirable. 

In the latter sense, norms may be used as criteria when evaluating health care, in 
order to determine the degree of conformity with what is desirable, the average 
length of stay of patients in hospital, etc. A distinction is sometimes made between 
norms, defined as quantitative indexes based on research, and standards, which are 
fixed arbitrarily. 

normal This term has three distinct meanings. Conceptual difficulties may arise if these 
different meanings are not specified, or if the area of their overlap is not clearly 
understood. 

1. Within the usual range of variation in a given population or population group; 
or frequently occurring in a given population or group. In this sense, "nor¬ 
mal" is frequently defined as, “within a range extending from two standard 
deviations below the mean to two standard deviations above the mean,” or 
“between specified (e.g., the 10th and 90th) percentiles of the distribution.”

2. In good health, indicative or predictive of good health, or conducive to good 
health. For a diagnostic or screening test, a "normal” result is one in a range 
within which the probability of a specific disease is low (see also normal lim¬ 
its). 

3. (Of a distribution) Gaussian; see also normal distribution. 

normal distribution (Syn: Gaussian distribution) The continuous frequency distribution
of infinite range represented by the equation

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}$$

where $x$ is the abscissa, $f(x)$ is the ordinate, $\mu$ is the mean, $e$ is the base of
the natural logarithm, 2.718..., and $\sigma$ the standard deviation.
[Figure: Normal distribution of heart rate; horizontal axis in standard deviations. From Rimm et al., 1980.]

The properties of a normal distribution include the following: (1) it is a continuous,
symmetrical distribution; both tails extend to infinity; (2) the arithmetic mean,
mode, and median are identical; and (3) its shape is completely determined by the
mean and standard deviation.
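
A minimal Python sketch of these properties, assuming a hypothetical heart-rate mean of 70 beats per minute and a standard deviation of 10 (illustrative values, not taken from Rimm et al.). The function names are also illustrative. It evaluates the ordinate on either side of the mean to show the symmetry, and computes the share of values lying within two standard deviations of the mean.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Ordinate f(x) of the normal curve with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the curve to the left of x, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 70.0, 10.0

# Symmetry: points equally far below and above the mean have the same ordinate.
left_ordinate = normal_pdf(mu - sigma, mu, sigma)
right_ordinate = normal_pdf(mu + sigma, mu, sigma)

# Share of the distribution between mean - 2 SD and mean + 2 SD (about 0.954).
share_within_2sd = normal_cdf(mu + 2 * sigma, mu, sigma) - normal_cdf(mu - 2 * sigma, mu, sigma)
```

The share within two standard deviations is the basis of the statistical sense of "normal" as a range from two standard deviations below the mean to two above it.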

normal limits The limits of the "normal" range of a test or measurement, in the sense
of being indicative of or conducive to good health. One way to determine normal
limits is to compare the values obtained when the measurements are made in two 
groups, one that is healthy and has been found to remain healthy, the other ill, or 
subsequently found to become ill. The result may be two overlapping distributions, 
as illustrated. Outside the area where the distributions overlap, a given value clearly 
identifies the presence or absence of disease or some other manifestation of poor 
health. If a value falls into the area of overlap, the individual may belong to either 
the normal or the abnormal group. The choice of the normal limits depends upon 
the relative importance attached to the identification of individuals as healthy or 
unhealthy. See also false negative; false positive; sensitivity and specificity.




[Figure: Hypothetical distribution of normal and diabetic glucose levels. From Lilienfeld and Lilienfeld, 1979.]

normative Pertaining to the normal, usual, accepted standard or values. See also norm.

nosocomial Arising while a patient is in a hospital or as a result of being in a hospital;
relating to a hospital; denoting a new disorder (unrelated to the patient's primary
condition) associated with being in a hospital.

nosocomial infection (Syn: hospital-acquired infection) An infection originating in a
medical facility, e.g., occurring in a patient in a hospital or other health care facility
in whom the infection was not present or incubating at the time of admission.
Includes infections acquired in the hospital but appearing after discharge; it also
includes such infections among staff.

nosography, nosology Classification of ill persons into groups, whatever the criteria
for their classification, and agreement as to the boundaries of the groups, is called
"nosology." The assignment of names to each disease entity in the group results in
a nomenclature of disease entities, or nosography. (Faber K: Nosography in Modern
Internal Medicine. New York: Hoeber, 1923.)

notifiable disease A disease that, by statutory requirements, must be reported to the
public health authority in the pertinent jurisdiction when the diagnosis is made. 

A disease deemed of sufficient importance to the public health to require that its 
occurrence be reported to health authorities. 

The reporting to public health authorities of communicable diseases is, unfortunately,
very incomplete. The reasons for this include diagnostic inexactitude; the
desire of patients and physicians to conceal the occurrence of conditions carrying a
social stigma, e.g., sexually transmitted diseases; and the indifference of physicians
to the usefulness of information about such diseases as hepatitis, influenza, and
measles. Yet notifications are extremely important. They provide the starting point
for investigations into the failure of preventive measures such as immunizations,
for tracing sources of infection, for finding common vehicles of infection, for
describing the geographic clustering of infection, and for various other purposes,
depending upon the particular disease.

N.S., n.s. Abbreviation, usually written lower case, for not statistically significant.

null hypothesis (Syn: test hypothesis) The statistical hypothesis that one variable has
no association with another variable or set of variables, or that two or more population
distributions do not differ from one another. In simplest terms, the null
hypothesis states that the results observed in a study, experiment, or test are no
different from what might have occurred as a result of the operation of chance
alone.

numerator The upper portion of a fraction used to calculate a rate or a ratio. 
numerical taxonomy The construction of homogeneous groupings or taxa using nu¬ 
merical methods; allied to cluster analysis. 

observational study (Syn: nonexperimental study, survey) Epidemiologic study in
situations where nature is allowed to take its course; changes or differences in one
characteristic are studied in relation to changes or differences in other(s), without
the intervention of the investigator.

observer variation (error) Variation (or error) due to failure of the observer to
measure or to identify a phenomenon accurately. Observer variation erodes scientific
credibility whenever it appears. Sir Thomas Browne in Pseudodoxia Epidemica
(1646), subtitled "Enquiries into very many commonly received tenents and presumed
truths," recognized several sources of error: "the common infirmity of human
nature, the erroneous disposition of the people, misapprehension, fallacy or
false deduction, credulity, obstinate adherence to authority, the belief in popular
conceits, the endeavours of Satan."

All observations are subject to variation. Discrepancies between repeated obser¬ 
vations by the same observer and between different observers are to be expected; 
these can be diminished but probably never absolutely eliminated.

Variation may arise from several sources. The observer may miss an abnormality 
or think he has found one where none is present; a measurement or a test may 
give incorrect results due to faulty technique or incorrect reading and recording of 
the results; or the observer may misinterpret the information. Two varieties of ob¬ 
server variation are interobserver variation, i.e., the amount observers vary from 
one another when reporting on the same material, and intraobserver variation, the 
amount one observer varies between observations when he reports more than once 
on the same material. 

Occam’s Razor (Syn: scientific parsimony) William of Occam’s 14th century dictum 
was that, "the assumptions introduced to explain a thing must not be multiplied 
beyond necessity." This useful maxim does not contradict the conclusion that mul¬ 
tiple causes operate in any system. The number of causes implicated depends on 
the frame of reference of the investigator and on the scope of the inquiry. 

occurrence (Syn: frequency) In epidemiology, a general term describing the fre¬ 
quency of a disease or other attribute or event in a population without distinguish¬ 
ing between incidence and prevalence. 

odds The ratio of the probability of occurrence of an event to that of nonoccurrence,
or the ratio of the probability that something is so, to the probability that it is not
so. If 60 smokers develop a chronic cough and 40 do not, the odds among these
100 smokers in favor of developing a cough are 60:40, or 1.5; this may be contrasted
with the probability that these smokers will develop a cough, which is 60/100
or 0.6.
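
The smoker example can be checked with a short Python sketch. The function names are illustrative, and the counts are the 60 and 40 smokers quoted in the entry.

```python
def odds_from_counts(events, non_events):
    """Odds in favor of the event: occurrences divided by nonoccurrences."""
    return events / non_events

def probability_from_counts(events, non_events):
    """Probability of the event: occurrences divided by all persons."""
    return events / (events + non_events)

def odds_to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

cough, no_cough = 60, 40          # counts quoted in the entry
cough_odds = odds_from_counts(cough, no_cough)                 # 1.5
cough_probability = probability_from_counts(cough, no_cough)   # 0.6
```

Odds and probability carry the same information; `odds_to_probability(1.5)` recovers the probability of 0.6.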

odds ratio (Syn: cross-product ratio, relative odds) The ratio of two odds. The term
"odds" is defined differently according to the situation under discussion. Consider
the following notation for the distribution of a binary exposure and a disease in a
population or a sample.

              Exposed    Unexposed
Disease          a           b
No disease       c           d

The odds ratio (cross-product ratio) is ad/bc.

The exposure-odds ratio for a set of case control data is the ratio of the odds in
favor of exposure among the cases (a/b) to the odds in favor of exposure among
noncases (c/d). This reduces to ad/bc. With incident cases, unbiased subject selection,
and a "rare" disease (say, under 2% cumulative incidence rate over the study period),
ad/bc is an approximate estimate of the risk ratio. With incident cases, unbiased
subject selection, and density sampling of controls, ad/bc is an estimate of
the ratio of the person-time incidence rates (forces of morbidity) in the exposed
and unexposed (no rarity assumption is required for this).

The disease-odds (rate-odds) ratio for a cohort or cross section is the ratio of the
odds in favor of disease among the exposed (a/c) to the odds in favor of disease
among the unexposed (b/d). This reduces to ad/bc and hence is equal to the
exposure-odds ratio for the cohort or cross section.

The prevalence-odds ratio refers to an odds ratio derived cross-sectionally, as, for
example, an odds ratio derived from studies of prevalent (rather than incident)
cases.

The risk-odds ratio is the ratio of the odds in favor of getting disease, if exposed,
to the odds in favor of getting disease if not exposed. The odds ratio derived from 
a cohort study is an estimate of this. See also case control study. 
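
A minimal Python sketch, using hypothetical counts for the four cells of a 2 x 2 table (a = exposed with disease, b = unexposed with disease, c = exposed without disease, d = unexposed without disease), shows that the exposure-odds ratio and the disease-odds ratio both reduce to the cross-product ratio ad/bc. The function names and counts are illustrative only.

```python
def exposure_odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/b) divided by odds among noncases (c/d)."""
    return (a / b) / (c / d)

def disease_odds_ratio(a, b, c, d):
    """Odds of disease among exposed (a/c) divided by odds among unexposed (b/d)."""
    return (a / c) / (b / d)

def cross_product_ratio(a, b, c, d):
    """The reduced form ad/bc."""
    return (a * d) / (b * c)

# Hypothetical cell counts, not data from any study.
a, b, c, d = 30, 70, 10, 90

or_exposure = exposure_odds_ratio(a, b, c, d)
or_disease = disease_odds_ratio(a, b, c, d)
or_cross = cross_product_ratio(a, b, c, d)   # 2700 / 700
```

All three values agree apart from floating-point rounding, which is why the same quantity can be estimated from case control, cohort, or cross-sectional data.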

one-tail test A statistical significance test based on the assumption that the data have
only one possible direction of variability. 

operational research The systematic study, by observation and experiment, of the 
working of a system, e.g., health services, with a view to improvement. 

operations research 

1. The fitting of models to data, or the designing of models. 

2. Synonym for operational research.

opportunistic infection Infection with organism(s) that are normally innocuous, e.g., 
commensals in the human, but become pathogenic when the body’s immunologic 
defenses are compromised, as happens in the acquired immunodeficiency syn¬ 
drome (AIDS). 

ordinal scale See measurement scale. 

ordinate The distance of a point, P, from the horizontal or x axis of a graph, measured
along the vertical or y axis. See also abscissa; graph.

outcomes All the possible results that may stem from exposure to a causal factor, or 
from preventive or therapeutic interventions; all identified changes in health status 
arising as a consequence of the handling of a health problem. See also causality; 
causation of disease, factors in. 

outliers Observations differing so widely from the rest of the data as to lead one to 
suspect that a gross error may have been committed, or suggesting that these values 
come from a different population. 

outbreak (Syn: epidemic) Sometimes the preferred word, as it may escape sensation¬ 
alism associated with the word epidemic. Alternatively, a localized as opposed to 
generalized epidemic. 

output The immediate result of professional or institutional health care activities,
usually expressed as units of service, e.g., patient hospital days, outpatient visits,
laboratory tests performed.

overmatching A situation that may arise when groups are matched. Several varieties 
can be distinguished: 

1. The matching procedure partially or completely obscures evidence of a true
causal association between the independent and dependent variables. Overmatching
may occur if the matching variable is involved in, or is closely connected
with, the mechanism whereby the independent variable affects the dependent
variable. The matching variable may be an intermediate cause in the
causal chain or it may be strongly affected by, or a consequence of, such an
intermediate cause.

2. The matching procedure uses one or more unnecessary matching variables, 
e.g., variables that have no causal effect or influence on the dependent vari¬ 
able, and hence cannot confound the relationship between the independent 
and dependent variables. 

3. The matching process is unduly elaborate, involving the use of numerous 
matching variables and/or insisting on very close similarity with respect to spe¬ 
cific matching variables. This leads to difficulty in finding suitable controls. 
See also matching.

overwintering See vector-borne infection. 

P, P (probability) value The probability that a test statistic would be as extreme as
or more extreme than observed if the null hypothesis were true. The letter P, followed
by the abbreviation n.s. (not significant) or by the symbol < (less than) and a
decimal notation such as 0.01 or 0.05, is a statement of the probability that the
difference observed could have occurred by chance, if the groups are really alike, i.e.,
under the null hypothesis.

Investigators may arbitrarily set their own significance levels, but in most biomedical
and epidemiologic work, a study result whose probability value is less than 5%
(P<0.05) or 1% (P<0.01) is considered sufficiently unlikely to have occurred by
chance to justify the designation "statistically significant." See also statistical
significance.
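
As an illustration only, the following Python sketch computes a two-sided P value under the assumption that the test statistic follows a standard normal distribution when the null hypothesis is true. That assumption, the function names, and the example statistics are not part of the definition above; other tests use other reference distributions.

```python
import math

def two_sided_p_value(z):
    """Probability of a statistic at least as extreme as z (in either direction),
    assuming z is standard normal under the null hypothesis."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def is_statistically_significant(p_value, alpha=0.05):
    """Compare a P value with a chosen significance level (5% by default)."""
    return p_value < alpha

p_borderline = two_sided_p_value(1.96)   # just under 0.05
p_small = two_sided_p_value(3.0)         # well under 0.01
p_large = two_sided_p_value(1.0)         # n.s. at the 5% level
```

The significance level is passed as a parameter because, as the entry notes, investigators may set their own.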

paired samples In a clinical trial, pairs of subject patients may be studied. One 
member of each pair receives the experimental regimen, and the other receives a 
suitably designated control regimen. Pairing should be based on a prognostic variable
such as age.

Pairing may similarly be used in a case control study or in a cohort study. 
See also matching. 

pandemic An epidemic occurring over a very wide area and usually affecting a large 
proportion of the population. 

panel study A combination of cross-sectional and cohort methods, in which the inves¬ 
tigator conducts a series of cross-sectional studies of the same individuals or study 
sample. This method of study permits changes in one variable to be related to 
changes in other variables. See also nested case-control study.

Panum, Peter Ludwig (1820-1885) A Danish physician who observed firsthand an 
epidemic of measles in the Faroe Islands in 1846. This was the first outbreak there 
for many years, and from the epidemic pattern, Panum deduced some basic, previously
unknown details about the method of spread, the incubation period, the
lasting immunity that followed infection, and the relationship between age and
severity of infection.

paradigm A typical example, a pattern of thought or conceptualization; an overall way 
of regarding phenomena, within which scientists normally work. A paradigm may 
dictate what form of explanation will be found acceptable, but a science may change 
paradigms. In many contexts in which it is used, the term is both ambiguous and
vague.¹ The word is often used loosely as a synonym for "factor" or "variable."

¹ Kuhn T. The Structure of Scientific Revolutions. Chicago: University of Chicago Press, 1962.

parameter In mathematics, a constant in a formula or model; in statistics and
epidemiology, a measurable characteristic of a population.

parametric test A statistical test that depends upon assumption(s) about the distribution
of the data, e.g., that these are normally distributed.



parasite An animal or vegetable organism that lives on or in another and derives its
nourishment therefrom. An obligate parasite is one that cannot lead an indepen¬ 
dent nonparasitic existence. A facultative parasite is one that is capable of either 
parasitic or independent existence. 

parasite count See WORM count. 

parasite density The collective degree of parasitemia in a population, calculated by 
the use of either the geometric mean or the weighted average of the individual 
parasite counts; e.g., by using a frequency distribution based on a geometric pro¬ 
gression. 

paratenic host (Syn: transport host) A second, third, or subsequent intermediate host 
of a parasite, in which the parasite does not undergo any development or replica¬ 
tion, but remains, usually encysted, until the paratenic host is ingested by the defin¬ 
itive host of the parasite. 

parity The status of a woman as regards the fact of having borne viable children. The 
number of full-term children previously borne by a woman, excluding miscarriages 
or abortions in early pregnancy, but including stillbirths. 

particularization A method of analysis opposite to generalization or abstraction. It 
focuses on the specificity of a number of facts and illustrates an issue through the 
use of example. 

passage The transfer of micro-organisms from human to animal host(s) either directly 
or via laboratory culture; in the laboratory, this procedure is used to establish the
Henle-Koch postulates. 

passenger variable A variable that varies systematically with the dependent variable 
under study, without being causally related to it; a third (explanatory) variable, the 
common cause of both the dependent and the passenger variable, "explains" or 
accounts for their association. 

PASSIVE SMOKING See INVOLUNTARY SMOKING.

Pasteur, Louis (1822-1895) A French chemist and biologist. One of the founders of 
bacteriology and therefore an important figure also in epidemiology. Starting in 
chemistry, he worked out the biological basis for fermentation, and then went on 
to make many important discoveries in bacteriology, notably vaccines against an¬ 
thrax and rabies. He is, of course, eponymously honored by the word "pasteuriza¬ 
tion." 

path analysis A mode of analysis involving assumptions about the direction of causal 
relationships between linked sequences and configurations of variables. This permits
the analyst to construct and test the appropriateness of alternative models (in
the form of a path diagram) of the causal relations that may exist within the array 
of variables included in the finite system studied. Identification of the less probable 
sequences of causal pathways may permit them to be eliminated from further con¬ 
sideration. 

pathogen Organism capable of causing disease (literally, causing a pathological pro¬ 
cess). 

pathogenesis The postulated mechanisms by which the etiologic agent produces dis¬ 
ease. The difference between etiology and pathogenesis should be noted: The 
etiology of a disease or disability consists of the postulated causes that initiate the
pathogenetic mechanisms; control of these causes might lead to prevention of the 
disease. 

pathogenicity The property of an organism that determines the extent to which overt 
disease is produced in an infected population, or the power of an organism to 
produce disease. Also used to describe comparable properties of toxic chemicals,
etc. Pathogenicity of infectious agents is measured by the ratio of the number of
persons developing clinical illness to the number exposed to infection. See also
virulence, with which pathogenicity is sometimes confused.

Pearson, Karl (1857-1936) British mathematician, biologist and geneticist. Pearson
was a pupil of Francis Galton, who led the science of statistics further into applications
in biology and genetics. He founded the journal Biometrika, coined the word
“biometry,” and taught the next generation of statistician/epidemiologists, including
Major Greenwood, Raymond Pearl, and others.

Pearson's product moment correlation See correlation coefficient.

pedigree A diagram showing the ancestral relationships and transmission of genetic 
traits over several generations of a family. 

peer review Process of review of research proposals, manuscripts submitted for pub¬ 
lication, abstracts submitted for presentation at scientific meetings, whereby these 
are judged for scientific and technical merit by other scientists in the same field. 
Also refers to review of clinical performance, when it is a form of medical audit. 

penetrance The frequency, expressed as a percentage, with which individuals of a
given genotype manifest at least some degree of a specific mutant phenotype associated
with a trait. See also genetic penetrance.

perceived need A felt need. The term usually refers to need for health care that is 
felt by the person or community concerned, but which may not be perceived by 
health professionals. 

percentile The set of divisions that produce exactly 100 equal parts in a series of 
continuous values, such as children's heights or weights. Thus a child above the 
90th percentile has a greater value for height or weight than over 90% of all in the 
series. 
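
A short Python sketch of the idea, using a made-up series of 100 heights (100 to 199 cm) rather than real growth data. `percentile_rank` is an illustrative helper, and `statistics.quantiles` from the standard library (Python 3.8 or later) is used to obtain the 99 cut points that divide the series into 100 equal parts.

```python
import statistics

def percentile_rank(series, value):
    """Percentage of the series lying strictly below the given value."""
    below = sum(1 for v in series if v < value)
    return 100.0 * below / len(series)

# Hypothetical heights in cm: 100, 101, ..., 199 (100 values).
heights_cm = list(range(100, 200))

# 99 cut points divide the ordered series into 100 equal parts.
cut_points = statistics.quantiles(heights_cm, n=100)

# A child 190 cm tall exceeds 90% of this series.
rank_of_190 = percentile_rank(heights_cm, 190)
```

Different software packages interpolate between observations in slightly different ways, so cut points computed on small samples may not agree exactly across tools.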

perinatal mortality Literally, mortality around the time of birth. Conventionally this 
time is limited to the period between 28 weeks gestation and one week postnatal. 
However, as the following discussion indicates, other factors, especially the weight 
of the fetus, should be considered. The Ninth (1975) Revision of the International
Classification of Diseases includes the following:

Perinatal mortality statistics 

It is recommended that national perinatal statistics should include all fetuses and 
infants delivered weighing at least 500 g (or, when birth weight is unavailable, the 
corresponding gestational age [22 weeks] or body length [25 cm crown-heel]),
whether alive or dead. It is recognized that legal requirements in many countries 
may set different criteria for registration purposes, but it is hoped that countries 
will arrange the registration or reporting procedures in such a way that the events 
required for inclusion in the statistics can be identified easily. It is further recom¬ 
mended that less mature fetuses and infants should be excluded from perinatal 
statistics unless there are legal or other valid reasons to the contrary. 

It is recommended above that national statistics would include fetuses and infants 
weighing between 500 g and 1000 g both for their inherent value and because their 
inclusion improves the completeness of reporting at 1000 g and over. 

Inclusion of this group of very immature births, however, disrupts international 
comparisons because of differences in national practices concerning their registra¬ 
tion. Another factor affecting international comparisons is that all live-born infants, 
irrespective of birth weight, are included in the calculation of rates, whereas some 
lower limit of maturity is applied to infants born dead. 

In order to eliminate these factors, it is recommended that countries should present,
solely for international comparisons, “standard perinatal statistics” in which
both the numerator and denominator of all rates are restricted to fetuses and infants
weighing 1000 g or more (or, where birth weight is unavailable, the corresponding
gestational age [28 weeks] or body length [25 cm crown-heel]).

perinatal mortality rate In most industrially developed nations, this is defined as 

$$\text{Perinatal mortality rate} = \frac{\text{Fetal deaths (28 weeks + of gestation)} + \text{postnatal deaths (first week)}}{\text{Fetal deaths (28 weeks + of gestation)} + \text{live births}} \times 1000$$

The World Health Organization's definition, more appropriate in nations with less 
well-established vital records, is 

$$\text{Perinatal mortality rate} = \frac{\text{Late fetal deaths (28 weeks + of gestation)} + \text{postnatal deaths (first week)}}{\text{Live births in a year}} \times 1000$$

Note the differences in denominator of the perinatal mortality rate as defined by 
WHO and in industrially developed nations. This makes international comparison 
difficult. The WHO Expert Committee on the Prevention of Perinatal Mortality 
and Morbidity (1970) recommended a more precise formulation: "Late fetal and
early neonatal deaths weighing over 1000 g at birth expressed as a ratio per 1000
live births weighing over 1000 g at birth."
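
The effect of the two denominators can be seen in a small Python sketch. The counts are invented for illustration, the function names are hypothetical, and both rates are assumed to be expressed per 1000.

```python
def perinatal_rate_developed(fetal_deaths, first_week_deaths, live_births):
    """Definition used in most industrially developed nations:
    denominator is fetal deaths plus live births."""
    return 1000.0 * (fetal_deaths + first_week_deaths) / (fetal_deaths + live_births)

def perinatal_rate_who(late_fetal_deaths, first_week_deaths, live_births):
    """WHO definition: denominator is live births only."""
    return 1000.0 * (late_fetal_deaths + first_week_deaths) / live_births

# Invented counts for one year in a hypothetical population.
fetal_deaths = 80          # 28 weeks or more of gestation
first_week_deaths = 70
live_births = 9920

rate_developed = perinatal_rate_developed(fetal_deaths, first_week_deaths, live_births)
rate_who = perinatal_rate_who(fetal_deaths, first_week_deaths, live_births)
```

With identical counts the WHO rate is slightly higher, because fetal deaths are left out of its denominator.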

periodic (medical) examinations Assessment of health status conducted at predetermined
intervals, e.g., annually or at specified milestones in life such as infancy,
school entry, preemployment, or preretirement. This form of medical examination
generally follows a formal protocol, e.g., employing a set of structured questions
and/or a predetermined set of laboratory tests.

PERIOD OF COMMUNICABILITY See COMMUNICABLE PERIOD. 

permissible exposure limit (PEL) An occupational health standard to safeguard employees
against dangerous chemicals or contaminants in the workplace. See SAFETY
STANDARDS.

personal health care Those services to individuals that are performed on a one-to- 
one basis by a health care worker for the purpose of maintaining or restoring health.

personal monitoring device An instrument attached to a person to measure the ex¬ 
posure of that person to hazardous substance(s). 

person-time A measurement combining persons and time, used as denominator in
person-time incidence and mortality rates. It is the sum of individual units of time
that the persons in the study population have been exposed to the condition of
interest. A variant is person-distance, e.g., as in passenger-kilometers. The most
frequently used person-time is person-years. With this approach, each subject contributes
only as many years of observation to the population at risk as he is actually
observed; if he leaves after one year, he contributes one person-year; if after ten,
ten person-years. The method can be used to measure incidence over extended and
variable time periods.

person-time incidence rate (Syn: interval incidence density) A measure of the incidence
rate of an event, e.g., a disease or death, in a population at risk, given by

$$\frac{\text{Number of events occurring during the interval}}{\text{Number of person-time units at risk observed during the interval}}$$
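
A minimal Python sketch of the calculation, assuming four hypothetical subjects followed for different lengths of time and two observed events. The names and numbers are illustrative only.

```python
def total_person_time(observation_times):
    """Sum of the individual periods during which each subject was observed at risk."""
    return sum(observation_times)

def person_time_incidence_rate(events, person_time_units):
    """Events during the interval divided by person-time at risk in the interval."""
    return events / person_time_units

# Years of observation for four hypothetical subjects.
years_observed = [1, 10, 4, 5]
person_years = total_person_time(years_observed)        # 20 person-years

events = 2
rate_per_person_year = person_time_incidence_rate(events, person_years)
rate_per_1000_person_years = 1000 * rate_per_person_year
```

A subject who leaves after one year adds one person-year to the denominator and one who stays ten years adds ten, so unequal follow-up is handled without discarding anyone.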


person-to-person spread of disease (Syn: prosodemic) See transmission of infection.

PERSON-YEARS See PERSON-TIME.

Petty, William (1623-1687) A member of the same circle as John Graunt, he is equally
recognized as a pioneer in vital statistics and economics. His ideas and concepts of 
lifetime earning capability are contained in Political Arithmetic (London, 1691). 

pharmacoepidemiology The study of the distribution and determinants of drug-related 
events in populations, and the application of this study to efficacious drug treatment.

physician (Syn: medical practitioner, doctor) Professional person qualified by educa¬ 
tion and authorized by law to practice medicine. 

pie chart A circular diagram divided into segments, each representing a category or
subset of data. The amount for each category is proportional to the angle sub¬ 
tended at the center of the circle and hence to the area of the sector. 

When several pie charts are used to describe several populations, the area of each 
circle is proportional to the size of the population it represents. 

pilot investigation, study A small-scale test of the methods and procedures to be
used on a larger scale if the pilot study demonstrates that these methods and pro¬ 
cedures can work. 

placebo, placebo effect An inert medication or procedure. The placebo effect (usually
but not necessarily beneficial) is attributable to the expectation that the regimen
will have an effect, i.e., the effect is due to the power of suggestion. See also halo
effect.

POINT SOURCE EPIDEMIC See EPIDEMIC, COMMON SOURCE.

Poisson distribution A distribution function used to describe the occurrence of rare
events or to describe the sampling distribution of isolated counts in a continuum of
time or space (e.g., sample counts of radioactive disintegration per minute). The
number of events has a Poisson distribution with parameter $\lambda$ (lambda) if the
probability of observing $k$ events ($k = 0, 1, \ldots$) is equal to

$$\frac{e^{-\lambda}\lambda^{k}}{k!}$$

where $e$ is the base of natural logarithm, 2.7183. . . . The mean and variance of
the distribution are both equal to $\lambda$. This distribution is used in modeling
person-time incidence rates.
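
The statement that the mean and variance both equal lambda can be checked numerically. The Python sketch below uses the standard Poisson probability function, exp(-lambda) * lambda^k / k!, with an arbitrary illustrative value of lambda = 2; the sums are truncated at 60 terms, beyond which the remaining probability is negligible for this lambda.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the Poisson parameter is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0                    # illustrative value only
counts = range(60)           # truncation point; the tail beyond it is negligible here

total_probability = sum(poisson_pmf(k, lam) for k in counts)
mean = sum(k * poisson_pmf(k, lam) for k in counts)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in counts)
```

Both `mean` and `variance` come out equal to `lam` up to rounding error, and the probabilities sum to 1.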

pollution Any undesirable modification of air, water, or food by substance(s) that are 
toxic or may have adverse effects on health or that are offensive though not necessarily
harmful to health.

polygenic inheritance The transmission of a phenotypic trait whose expression depends
upon the additive effect of a number of genes.

ponderal index The anthropometric index of body mass. Defined as height divided 
by the cube root of the body weight. The body mass index is generally regarded as 
a better index of body mass. 
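
As a worked illustration, the definition can be written as a one-line Python function. The entry does not state units, so centimeters and kilograms are assumed here, and the example person is hypothetical.

```python
def ponderal_index(height, weight):
    """Height divided by the cube root of body weight, as defined in the entry.
    Units are an assumption of this sketch (cm and kg)."""
    return height / weight ** (1.0 / 3.0)

# Hypothetical adult: 170 cm tall, 64 kg (cube root of 64 is 4).
example_index = ponderal_index(170.0, 64.0)   # approximately 42.5
```

Because the value depends on the units chosen, indices computed in different unit systems are not directly comparable.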
[Figure: Pie charts of age structure of the population, including a 65+ years segment. (Figures outside the circle show the population in millions.) From World Health Organization.]


population

1. All the inhabitants of a given country or area considered together; the number
of inhabitants of a given country or area.

2. (In sampling) The whole collection of units from which a sample may be drawn;
not necessarily a population of persons; the units may be institutions, records,
or events. The sample is intended to give results that are representative of the
whole population.

population attributable risk (PAR) This term is used by many epidemiologists ¹,²
in preference to the terms “attributable fraction (population)” or “etiologic fraction
(population).” It is the incidence of a disease in a population that is associated with
(attributable to) exposure to the risk factor. It is often expressed as a percentage.
It is calculated by similar methods to those described for attributable fraction
(population), i.e.,

$$\mathrm{PAR\%} = \frac{P_e (I_e - I_u)}{P_t \times I_t} \times 100$$

where $P_e$ = number of persons exposed
$P_t$ = persons in the population
$I_e$ = incidence rate among the exposed
$I_u$ = incidence rate among the unexposed
$I_t$ = incidence rate for the total population

In a case-control study, PAR can be estimated in various ways; Cole and MacMahon ³
give the following formula:

$$\mathrm{PAR\%} = \frac{P_e (RR - 1)}{1 + P_e (RR - 1)} \times 100$$

where $P_e$ = proportion of controls exposed
$RR$ = relative risk for exposed, compared to risk of 1 for the unexposed.

¹ MacMahon B, Pugh TF: Epidemiology: Principles and Methods. Boston: Little, Brown, 1970.
² Fletcher RH, Fletcher SW, Wagner EH: Clinical Epidemiology: The Essentials. Baltimore: Williams
& Wilkins, 1982.

³ Cole P, MacMahon B: Attributable risk percent in case-control studies. Brit J Prev Soc Med 25:242-244,
1971.
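
The two PAR% calculations can be sketched in Python. This assumes the cohort form is PAR% = P_e(I_e - I_u) / (P_t x I_t) x 100 and the case-control form is PAR% = P_e(RR - 1) / [1 + P_e(RR - 1)] x 100, where P_e in the second formula is the proportion of controls exposed. The function names and all numbers are hypothetical.

```python
def par_percent_cohort(n_exposed, n_total, i_exposed, i_unexposed, i_total):
    """PAR% from incidence rates: excess cases among the exposed over all cases."""
    return 100.0 * n_exposed * (i_exposed - i_unexposed) / (n_total * i_total)

def par_percent_case_control(prop_controls_exposed, relative_risk):
    """PAR% from the proportion of controls exposed and the relative risk."""
    excess = prop_controls_exposed * (relative_risk - 1.0)
    return 100.0 * excess / (1.0 + excess)

# Hypothetical population: 300 of 1000 persons exposed.
n_exposed, n_total = 300, 1000
i_exposed, i_unexposed = 0.03, 0.01
# Incidence in the total population is the weighted average of the two groups.
i_total = (n_exposed * i_exposed + (n_total - n_exposed) * i_unexposed) / n_total

par_cohort = par_percent_cohort(n_exposed, n_total, i_exposed, i_unexposed, i_total)

# Same scenario expressed as exposure proportion 0.3 and relative risk 0.03 / 0.01 = 3.
par_case_control = par_percent_case_control(0.3, 3.0)
```

For this made-up population both routes give 37.5%, which is a useful consistency check on the two formulas.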


population attributable risk percent This is the attributable fraction in the population,
expressed as a percentage. See also attributable fraction (population).

population based Pertaining to a general population defined by geopolitical boundaries;
this population is the denominator and/or the sampling frame.

population dynamics Changes in the structure of a population; loosely used as a synonym
for demography.

population excess rate A measure of the amount of disease associated with exposure
to a putative cause of the disease in the population. It is the difference between the 
rates of disease in the entire population and among the nonexposed. 
population medicine See community medicine. 

population momentum In a growing population, the phenomenon of continuing pop¬ 
ulation growth beyond the time when replacement level fertility has been achieved, 
because of the increasing size of child-bearing and younger age cohorts, resulting 
from higher fertility and/or falling mortality in preceding years. 
population pyramid A graphic presentation of the age and sex composition of the 
population. The population pyramid is constructed by computing the percentage 
distribution of a population, simultaneously cross-classified by sex and age. The 
percentage that each female age group is of the total is plotted on the right and the 
corresponding percentages for males are plotted on the left. A population pyramid 
is intended to provide a quick overall comprehension of age and sex structure in 
the population. A population whose pyramid has a broad base and narrow apex 
may be identified as a high fertility population. Changing shape over time reflects
the changing composition of the population, associated with changes in fertility and 
mortality at each age. 
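
The percentage distribution behind a pyramid can be sketched as follows. This is only an illustration: the age groups and counts are invented, and the helper name is an assumption.

```python
def pyramid_percentages(counts):
    """Percentage of the total population in each age-sex cell.

    counts maps an age group to a (males, females) pair of head counts.
    Returns the same keys mapped to (male %, female %) of the whole population.
    """
    total = sum(males + females for males, females in counts.values())
    return {
        age: (100 * males / total, 100 * females / total)
        for age, (males, females) in counts.items()
    }


# Hypothetical high-fertility population: broad base, narrow apex.
counts = {"0-14": (300, 280), "15-64": (200, 190), "65+": (10, 20)}
pct = pyramid_percentages(counts)
for age, (male_pct, female_pct) in pct.items():
    # Males are plotted on the left, females on the right.
    print(f"{male_pct:5.1f}% | {age:^5} | {female_pct:5.1f}%")
```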

Since the figure is two dimensional, the word “pyramid” is incorrectly used, but
the more accurate word “profile” has never caught on.

Population pyramids.

Top: High fertility, low proportion survive to old age (Mexico).
Bottom: Low fertility, high proportion survive to old age (Sweden).
From Last, 1960.


population, study The group selected for investigation. 

population, target The group from which a study population is selected. 

POSTERIOR ODDS, posterior probability Probability calculated after reference to re¬ 
sults of a study. See Bayes theorem. 

postmarketing surveillance A procedure implemented after a drug has been
licenced for public use, designed to provide information on the actual use of the drug
for a given indication, and on the occurrence of side effects, adverse reactions, etc.
A method for epidemiologic study of adverse drug reactions.

postneonatal mortality rate The number of infant deaths between 28 days and
one year of age in a given year per 1000 live births in that year. It is a
rate to monitor in developing countries, where older infants frequently die of
infections and malnutrition.

potency The strength of a particular drug, toxin, or hazard; the ratio of the dose of a 
standard amount required to elicit a specific response, to the dose of the test agent
that elicits the same response. 

potential years of life lost (PYLL) A measure of the relative impact of various
diseases and lethal forces on society. PYLL highlights the loss to society as a result
of youthful or early deaths. The figure for potential years of life lost due to a
particular cause is the sum, over all persons dying from that cause, of the years that 
these persons would have lived had they experienced normal life expectation. The 
concept derives from Petty’s Political Arithmetic (1687) and is elaborated upon in 
Dublin and Lotka’s Money Value of a Man (1930). 
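
The summation can be sketched as below. One simplifying assumption is made here that the definition itself does not impose: a single fixed reference age (75 in this example) stands in for "normal life expectation", and deaths at or beyond that age contribute nothing. The ages are invented.

```python
def potential_years_of_life_lost(ages_at_death, reference_age=75):
    """Sum, over all deaths from a cause, of the years short of reference_age.

    reference_age is an assumed stand-in for normal life expectation.
    """
    return sum(max(0, reference_age - age) for age in ages_at_death)


# Hypothetical ages at death from one cause.
ages = [2, 30, 60, 80]
pyll = potential_years_of_life_lost(ages)
print(pyll)
```

The death at age 2 contributes far more to the total than the death at age 60, which is how the measure gives weight to early deaths.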

power A characteristic of a statistical hypothesis test, denoting the probability that the 
null hypothesis will be rejected if it is indeed false. It is equal to 1 minus the prob¬ 
ability of type II error. See also error, resolution. Resolving power is the com¬ 
parable property of individual measurements. 
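
As an illustrative sketch, power can be computed directly for one simple case: a two-sided one-sample z test with known standard deviation. The choice of test, the function name, and the numbers are assumptions made for the example; other tests need other formulas.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z test with known sigma.

    effect is the true difference between the population mean and the
    value stated by the null hypothesis.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = z_test_power(effect=0.5, sigma=1.0, n=32)
beta = 1 - power  # probability of type II error
print(f"power = {power:.3f}, type II error = {beta:.3f}")
```

Raising the sample size or the true effect raises the power; with no true effect the function returns the significance level alpha.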

pragmatic study A study whose aim is to improve health status or health care of a 
specified population, provide a basis for decisions about health care, or evaluate 
previous actions. See also explanatory study; community diagnosis; program review.

PRECISION 

1. The quality of being sharply defined or stated. One measure of precision is 
the number of distinguishable alternatives from which a measurement was 
selected, sometimes indicated by the number of significant digits in the mea¬ 
surement. Another measure of precision is the standard error of measure¬ 
ment, the standard deviation of a series of replicate determinations of the 
same quantity. Precision does not imply accuracy. See also measurement,
problems with terminology.

2. In statistics, precision is defined as the inverse of the variance of a
measurement or estimate.

precursor An early stage in the course of a disease, or a condition or state preceding 
pathological onset of a disease; sometimes detectable by screening; may be identi¬ 
fied as a risk marker. 

predictive value In screening and diagnostic tests, the probability that a person with 
a positive test is a true positive (i.e., does have the disease) is referred to as the 
"predictive value of a positive test." The predictive value of a negative test is the 
probability that a person with a negative test does not have the disease. The predic¬ 
tive value of a screening test is determined by the sensitivity and specificity of the 
test, and by the prevalence of the condition for which the test is used. See also 
SCREENING; SENSITIVITY AND SPECIFICITY. 
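
The dependence on prevalence can be shown with a short sketch based on Bayes' theorem. The function name and the test characteristics used are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test applied where the condition is rare and where it is common.
ppv_rare, npv_rare = predictive_values(0.90, 0.95, prevalence=0.01)
ppv_common, npv_common = predictive_values(0.90, 0.95, prevalence=0.20)
print(f"prevalence 1%:  PPV = {ppv_rare:.3f}, NPV = {npv_rare:.3f}")
print(f"prevalence 20%: PPV = {ppv_common:.3f}, NPV = {npv_common:.3f}")
```

With identical sensitivity and specificity, the predictive value of a positive test is much lower when the condition is rare.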

premunition A term used mainly in the epidemiology of parasitic diseases, especially
malaria. It signifies a state of resistance, in a host harboring a parasite, to
superinfection by a parasite of the same species. This state is dependent on the continued
survival of parasites in the body and disappears after their elimination. It may be
complete or partial.

prepatent period In parasitology, the period equivalent to the incubation period of 
microbial infections; the corresponding phase may be biologically different from 
microbial multiplication when the invading organism is a multicellular parasite that 
undergoes developmental stages in the host. 

PRESCRIPTIVE SCREENING See SCREENING.

prevalence The number of instances of a given disease or other condition in a given
population at a designated time; sometimes used to mean prevalence rate. When
used without qualification the term usually refers to the situation at a specified
point in time (point prevalence).

prevalence, annual (An occasionally used index) The total number of persons
with the disease or attribute at any time during a year. It includes cases of the
disease arising before but extending into or through the year as well as those having
their inception during the year.

prevalence, lifetime The total number of persons known to have had the disease 
or attribute for at least part of their life. 

prevalence, period The total number of persons known to have had the disease 
or attribute at any time during a specified period. 

prevalence, point The number of persons with a disease or an attribute at a 
specified point in time. 

prevalence rate (ratio) The total number of all individuals who have an attribute
or disease at a particular time (or during a particular period) divided by the
population at risk of having the attribute or disease at this point in time or midway
through the period. A problem may arise with calculating period prevalence rates
because of the difficulty of defining the most appropriate denominator. See also
prevalence.

PREVALENCE STUDY See CROSS-SECTIONAL STUDY. 

preventable fraction (population) In a situation in which exposure to a given factor 
is believed to protect against a disease (or other outcome), the preventable fraction 
in the population is the proportion of the disease (in the population) that would be 
prevented if the whole population were exposed to the factor. This value must be 
interpreted with caution, as part or all of the apparent protective effect may be due 
to other factors associated with the apparent protective factor.

In a study of a total population, the preventable fraction (population) is
computed as $(I_p - I_e)/I_p$, where $I_p$ is the incidence rate of the disease (or other
outcome) in the population, and $I_e$ is the incidence rate in the exposed persons in
the population.

prevented FRACTION (population) In a situation in which exposure to a given factor is 
believed to protect against a disease (or other outcome), the prevented fraction is 
the proportion of the hypothetical total load of disease (in the population) that has 
been prevented by exposure to the factor. This value must be interpreted with 
caution, as part or all of the apparent protective effect may be due to other factors 
associated with the apparent protective factor. 

In a study of a total population the prevented fraction is computed as
$(I_u - I_p)/I_u$, where $I_p$ is the rate of the disease in the population, and $I_u$ is
the rate among people unexposed to the factor.
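
The prevented fraction, (I_u - I_p)/I_u, and the preventable fraction of the preceding entry, (I_p - I_e)/I_p, can be compared in a small sketch. The function names and rates are hypothetical, and the caution about confounding by other factors applies to any such calculation.

```python
def preventable_fraction(i_population, i_exposed):
    """Proportion of disease that would be prevented if all were exposed."""
    return (i_population - i_exposed) / i_population


def prevented_fraction(i_population, i_unexposed):
    """Proportion of the hypothetical disease load already prevented."""
    return (i_unexposed - i_population) / i_unexposed


# Hypothetical protective factor: half of the population is exposed.
i_u, i_e = 0.010, 0.004
i_p = 0.5 * i_e + 0.5 * i_u  # rate in the whole population
print(f"preventable fraction = {preventable_fraction(i_p, i_e):.3f}")
print(f"prevented fraction   = {prevented_fraction(i_p, i_u):.3f}")
```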

prevention The goals of medicine are to promote health, to preserve health, to restore 
health when it is impaired, and to minimize suffering and distress. These goals are 
embodied in the word “prevention," which is easiest to define in the context of 
levels, customarily called primary, secondary, and tertiary prevention. Authorities 
on preventive medicine do not agree on the precise boundaries between these 
levels, nor on how many levels can be distinguished, but the differences of opinion 
are semantic rather than substantive. 

An epidemiologic interpretation of the distinction between primary and secondary
prevention is that primary prevention is aimed at reducing incidence of disease
and other departures from good health, secondary prevention aims to reduce
prevalence by shortening the duration, and tertiary prevention is aimed at reducing
complications.

Primary prevention can be defined as the protection of health by personal and
community-wide effects, e.g., preserving good nutritional status, physical fitness,
and emotional well-being, immunizing against infectious diseases, and making the
environment safe. (But see also health promotion.)

Secondary prevention can be defined as the measures available to individuals and 
populations for the early detection and prompt and effective intervention to correct 
departures from good health. 

Tertiary prevention consists of the measures available to reduce or eliminate long¬ 
term impairments and disabilities, minimize suffering caused by existing departures 
from good health, and to promote the patient's adjustment to irremediable condi¬ 
tions. This extends the concept of prevention into the field of rehabilitation. 

preventive medicine The application of preventive measures by clinical practitioners. 
A specialized field of medical practice composed of distinct disciplines that utilize 
skills focusing on the health of defined populations in order to promote and main¬ 
tain health and well-being and prevent disease, disability, and premature death. 

In addition to the knowledge of basic and clinical sciences and the skills common
to all physicians, the distinctive aspects of preventive medicine include knowledge
of and competence in biostatistics, epidemiology, administration including
planning, organization, management, financing, and evaluation of health programs;
environmental health; application of social and behavioral factors in health and
disease; and the application of primary, secondary, and tertiary prevention measures
within clinical medicine. (The above is the definition and description of the field
that has been adopted by the American College of Preventive Medicine; for
completeness, at least two other items ought to be added, i.e., health education and
nutrition).

primary case The individual who introduces the disease into the family or group un¬ 
der study. Not necessarily the first diagnosed case in a family or group. See also 
index case. 

PRIMARY HEALTH CARE 

1. Health care that begins at the time of first encounter between a patient and a 
provider of health care. An alternative term is primary medical care.

2. The WHO definition of primary health care includes much more: Primary 
health care is essential health care made accessible at a cost the country and 
the community can afford, with methods that are practical, scientifically sound, 
and socially acceptable. Everyone in the community should have access to it, 
and everyone should be involved in it. Related sectors should also be involved 
in it in addition to the health sector. At the very least it should include edu¬ 
cation of the community on the health problems prevalent and on methods of 
preventing health problems from arising or of controlling them; the promo¬ 
tion of adequate supplies of food and of proper nutrition; sufficient safe water 
and basic sanitation; maternal and child health care including family planning; 
the prevention and control of locally endemic diseases; immunization against 
the main infectious diseases; appropriate treatment of common diseases and 
injuries; and the provision of essential drugs. (From Glossary of Terms Used in
the Health for All Series No. 1-8. Geneva: WHO, 1984.)

principal component analysis A statistical method to simplify the description of a
set of interrelated variables. Its general objectives are data reduction and
interpretation; there is no separation into dependent and independent variables; the
original set of correlated variables is transformed into a smaller set of uncorrelated
variables called the principal components. Often used as the first step in a factor
analysis.

prior probability Probability calculated or estimated from theory or belief, before a 
study is done. See Bayes' theorem.

PROBABILITY 

1. The limit of the relative frequency of an event in a sequence of N random
trials as N approaches infinity, i.e.,

$$\lim_{N \to \infty} \frac{\text{Number of occurrences of the event}}{N}$$
2. A measure, ranging from zero to 1, of the degree of belief in a hypothesis or 
statement. 
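
The relative-frequency definition can be illustrated by simulation. This is only a sketch: a seeded pseudo-random generator stands in for truly random trials, and the event probability of 0.5 is an arbitrary choice.

```python
import random


def relative_frequency(n_trials, p_event=0.5, seed=1):
    """Relative frequency of an event over n_trials simulated random trials."""
    rng = random.Random(seed)
    occurrences = sum(rng.random() < p_event for _ in range(n_trials))
    return occurrences / n_trials


# The relative frequency settles toward the underlying probability as N grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```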

probability DENSITY The frequency distribution of a continuous random variable. 

probability DISTRIBUTION For a discrete random variable, the function that gives the 
probabilities that the variable equals each of a sequence of possible values. Exam¬ 
ples include the binomial and Poisson distributions. For a continuous random vari¬ 
able, often used synonymously with the probability density function. 

probability sample (Syn: random sample) See sample. 

probability theory The branch of mathematics dealing with the purely logical
properties of probability. Its theorems underlie most statistical methods.

proband See PROPOSITUS. 

PROBLEM-ORIENTED medical record (POMR) A medical record in which the patient’s
history, physical findings, laboratory results, etc., are organized to give a cumulative
record of problems, e.g., hemoptysis, rather than disease, e.g., pneumonia. The
record includes subjective, objective, and significant negative information,
discussions and conclusions, and diagnostic and treatment plans with respect to each
problem. The record, which was developed by Lawrence Weed,¹ contrasts with the
traditional medical record, which is less formally organized, usually recording all 
information from each source (history, physical, and laboratory findings) together 
without regard to the problems the information describes. 

Since the problems may not be described in terms of conventional disease labels, 
their classification and counting for epidemiologic purposes are sometimes difficult. 
The international classification of health problems in primary care (ICHPPC)
is an attempt to overcome this difficulty. 

¹ Weed LL: Medical records that guide and teach. New Engl J Med 278:595-600, 652-657, 1968.

procatarctic cause A term used by epidemiologists of the late 19th and early 20th
centuries, probably last used by Greenwood, to describe predisposing causes
associated with habits of life.

Professional Activity Study (PAS) The hospital discharge abstract system that
covers many acute short-stay hospitals in the United States. It provides regularly
published statistical tables arranged according to hospital service, diagnostic
category, etc., giving details on diagnostic and therapeutic procedures, length of stay
and outcome.

PROGRAM 

1. A (formal) set of procedures to conduct an activity, e.g., control of malaria.

2. An ordered list of instructions directing a computer to carry out a desired 
sequence of operations. The objective is normally the solution of a problem.

PROGRAM EVALUATION AND REVIEW TECHNIQUES (PERT) A work-scheduling method that
uses algorithms and also enunciates general principles of procedure for allocating
resources. Calls for listing specific tasks to be completed and the resources
(personnel, equipment, supplies, and other items) that will be needed, along with their
costs, a time chart indicating when each component task is to begin and end, giving
interim accomplishment levels during that period, and a specification of times for
interim review of the progress of the plan.

program review An evaluative study of a specific health program operating in a spe¬ 
cific setting, performed to provide a basis for decisions concerning the operation of 
the program. 

program trial An experimental or quasi-experimental evaluative study of a (health)
program. 

prolective Pertaining to data collected by planning in advance. Contrast retrolective.
The terms prolective and retrolective, coined by AR Feinstein,¹ are said to describe
more precisely the actions of research workers than the common terms prospective
and retrospective; use of these terms is limited, and is deprecated by many
epidemiologists.

¹ Clin Pharmacol Ther 30:564-577, 1981.

proportion A type of ratio in which the numerator is included in the denominator.
The ratio of a part to the whole, expressed as a "decimal fraction" (e.g., 0.2), as a
"vulgar fraction" (1/5), or, loosely, as a percentage (20%). By definition, a proportion
(p) must be in the range (decimal) 0.0 ≤ p ≤ 1.0. Since numerator and denominator
have the same dimension, any dimensional contents cancel out, and a proportion is
a dimensionless quantity. Where numerator and denominator are based upon counts
rather than upon measurements, the originals are also dimensionless, although it
should be understood that proportions can be used for measured quantities, e.g.,
the skin area of the lower limb is x percent of the total skin area, as well as for 
counts, e.g., 0.15 of the population died. A prevalence rate is a count-based pro¬ 
portion. The nondimensionality of a proportion, and its range limitations, do not 
necessarily apply to other kinds of ratios, of which "proportion" is a subset. See also 
rate; ratio. 

proportional hazards model (Syn: Cox model) A statistical model in survival
analysis that asserts that the effect of the study factors on the hazard rate in the
study population is multiplicative and does not change over time. For example, the
model for two factors $x_1$ and $x_2$ asserts that the rate at time $t$, $\lambda(t)$, is given by

$$\lambda(t) = \lambda_0(t)\,\exp(\beta_1 x_1 + \beta_2 x_2)$$

where $\lambda_0(t)$ is the rate when $x_1 = x_2 = 0$, $\beta_1$ and $\beta_2$ are the coefficients of the
two factors, and exp is the (natural) exponential function.
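
A brief sketch of what "multiplicative and does not change over time" implies: under the model, rate = baseline rate x exp(b1 x1 + b2 x2), so the ratio of the rates for two sets of factor values is the same whatever the baseline rate is at that moment. The coefficients and baseline rates below are hypothetical.

```python
import math


def hazard(baseline_rate, betas, xs):
    """Hazard rate under a proportional hazards model at one point in time."""
    linear_predictor = sum(b * x for b, x in zip(betas, xs))
    return baseline_rate * math.exp(linear_predictor)


betas = (0.7, -0.2)  # hypothetical coefficients for two study factors

# Ratio of hazards (factor 1 present vs absent) at two different times,
# represented here by two different baseline rates.
early = hazard(0.01, betas, (1, 0)) / hazard(0.01, betas, (0, 0))
late = hazard(0.05, betas, (1, 0)) / hazard(0.05, betas, (0, 0))
print(early, late, math.exp(betas[0]))
```

Both ratios equal exp(0.7), the hazard ratio for the first factor, even though the baseline rate differs fivefold.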

proportionate mortality rate, ratio (PMR) Number of deaths from a given cause
in a specified time period, per 100 or 1000 total deaths in the same time period. 
Can give rise to misleading conclusions if used to compare mortality experience of 
populations with different distributions of causes of death. 
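
The calculation, and the way it can mislead, can be sketched with invented figures. In this hypothetical example two populations of equal size have the same number of deaths from the cause of interest, yet their proportionate mortality differs because deaths from other causes differ.

```python
def proportionate_mortality(deaths_from_cause, total_deaths, per=100):
    """Deaths from a given cause per `per` total deaths in the same period."""
    if total_deaths <= 0:
        raise ValueError("total deaths must be positive")
    return deaths_from_cause / total_deaths * per


# Hypothetical populations A and B, each with 50 deaths from the cause.
pmr_a = proportionate_mortality(50, 500)    # few deaths from other causes
pmr_b = proportionate_mortality(50, 1000)   # many deaths from other causes
print(pmr_a, pmr_b)
```

Population A appears to have twice the mortality from the cause, although the number of deaths from it is identical in both.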

propositus (Syn: proband) The family member who first draws attention to a (genetic) 
pedigree of a given trait. The index case in a genetic study. 

prospective study See cohort study. 

protocol The plan, or set of steps, to be followed in a study or investigation, or in an 
intervention program. See also algorithm, clinical. 

proximate determinant of fertility Factor having a direct influence on fertility;
such factors include age at marriage, breastfeeding, abortion, and contraceptive

public health Public health is one of the efforts organized by society to protect,
promote, and restore the people's health. It is the combination of sciences, skills, and
beliefs that is directed to the maintenance and improvement of the health of all the
people through collective or social actions. The programs, services, and institutions
involved emphasize the prevention of disease and the health needs of the
population as a whole. Public health activities change with changing technology and social
values, but the goals remain the same: to reduce the amount of disease, premature
death, and disease-produced discomfort and disability in the population. Public health
is thus a social institution, a discipline, and a practice.

punch card A card on which data are stored by means of holes punched in specified 
positions; useful in storing, processing, and analyzing data. Edge-punch cards have 
marginal holes converted to slots by punching so that they can be manually sorted. 
The commonly used variety of punch cards have 80 columns and 12 rows. In each 
column of the card there are 12 positions at which holes may be punched, accord¬ 
ing to a predetermined code. The position of the hole is the means of identifying 
the value of a variable. Punch cards of this type are sorted mechanically or
electrically to provide a rapid means of processing and analyzing data, sometimes of great
complexity. See also data processing. 

P value See P (probability).

QALY Acronym for quality-adjusted life years; this is an adjustment of life expectancy
that allows for prevalence of activity-limitation, assessed from hospital discharge
data or by health survey data, in the population subgroup for which QALY is
calculated. For example, the life expectancy of males at birth in Canada in 1978 was
70.8 years; after adjusting for activity-limitation using health survey data,
quality-adjusted life expectancy, or QALY, was 65.8 years.¹

¹ Wilkins R, Adams O: Healthfulness of Life. Montreal, 1983.

qualitative data Observations or information characterized by measurement on a
categorical scale, i.e., a dichotomous or nominal scale, or, if the categories are
ordered, an ordinal scale. Examples are sex, hair color, death or survival, and
nationality. See also measurement scale.

quality control The supervision and control of all operations involved in a process, 
usually involving sampling and inspection, in order to detect and correct systematic
or excessively random variations in quality. 

quality of care A level of performance or accomplishment that characterizes the health 
care provided. Ultimately, measures of the quality of care always depend upon 
value judgments, but there are ingredients and determinants of quality that can be 
measured objectively. These ingredients and determinants have been classified by 
Donabedian¹ into measures of structure (e.g., manpower, facilities), process (e.g.,
diagnostic and therapeutic procedures), and outcome (e.g., case fatality rates,
disability rates, and levels of patient satisfaction with the service). See also health
services research.

¹ Donabedian A: A Guide to Medical Care Administration (Vol. 2). New York: American Public Health
Association, 1969.

quality of life In a general sense, that which makes life worth living. In a more
"quantitative" sense, an estimate of remaining life free of impairment, disability or
handicap, as used in the expression "quality adjusted life years." Somewhere
between these is an estimate of the utility of life; for instance, in clinical decision
analysis, the utility of life that is impaired by a disabling degree of angina pectoris 
may be compared with that of a life that may be shorter in duration but free of 
disabling pain, as a result of applying therapeutic procedures. Such trade-offs are 
part of clinical decision analysis. See also utility. 

quantiles Divisions of a distribution into equal, ordered subgroups. Deciles are tenths;
quartiles, quarters; quintiles, fifths; terciles, thirds; and centiles, hundredths.
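
Cut points for these divisions can be obtained with the `statistics.quantiles` function of the Python standard library, as in the sketch below. The data are an arbitrary run of integers, and the function's default interpolation method is only one of several conventions for locating cut points.

```python
from statistics import quantiles

data = list(range(1, 100))  # hypothetical observations: 1, 2, ..., 99

# n equal subgroups are separated by n - 1 cut points.
terciles = quantiles(data, n=3)
quartiles = quantiles(data, n=4)
quintiles = quantiles(data, n=5)
deciles = quantiles(data, n=10)
centiles = quantiles(data, n=100)

print("quartile cut points:", quartiles)
print("number of decile cut points:", len(deciles))
```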

quantitative data Data in numerical quantities such as continuous measurements or 
counts. 

quarantine The 14th edition of Control of Communicable Diseases in Man¹ gives the
following:

Restriction of the activities of well persons or animals who have been exposed to
a case of communicable disease during its period of communicability (i.e., contacts)
to prevent disease transmission during the incubation period if infection should
occur.

a) Absolute or complete quarantine: The limitation of freedom of movement of
those exposed to a communicable disease for a period of time not longer than
the longest usual incubation period of that disease, in such manner as to prevent
effective contact with those not so exposed (see Isolation).

b) Modified quarantine: A selective, partial limitation of freedom of movement of 
contacts, commonly on the basis of known or presumed differences in suscepti¬ 
bility and related to the danger of disease transmission. It may be designed to 
meet particular situations. Examples are exclusion of children from school, ex¬ 
emption of immune persons from provisions applicable to susceptible persons, or 
restriction of military populations to the post or to quarters. It includes: Personal 
surveillance, the practice of close medical or other supervision of contacts in or¬ 
der to permit prompt recognition of infection or illness but without restricting 
their movements; and Segregation, the separation of some part of a group of 
persons or domestic animals from the others for special consideration, control or
observation—removal of susceptible children to homes of immune persons, or 
establishment of a sanitary boundary to protect uninfected from infected por¬ 
tions of a population. 

See also isolation. 

¹ Washington, DC: American Public Health Association, 1985.

QUASI-EXPERIMENT An experiment in which the investigator lacks full control over the
allocation and/or the timing of the intervention.

questionnaire A predetermined set of questions used to collect data—clinical data, 
social status, occupational group, etc. This term is often applied to a self-completed
survey instrument, as contrasted with an interview schedule. 

Quetelet, Lambert Adolphe Jacques (1796-1874) Belgian astronomer, statistician,
and social scientist, one of the first to apply statistical thinking to the social and
biological sciences, e.g., in delineating the (normal) distribution of variables such as
height in the population. He influenced others who followed, e.g., Florence
Nightingale.

Quetelet’s index See body mass index.

quota sampling A method by which the proportions in the sample in various subgroups
(according to criteria such as age, sex, and social status of the individuals to be
selected) are chosen to agree with the corresponding proportions in the population.
The resulting sample may not be representative of characteristics that have not
been taken into account.

quotient The result of the division of a numerator by a denominator.

race Persons who are relatively homogeneous with respect to biological inheritance.
See also ethnic group.

radix The hypothetical size of the birth cohort in a life table, commonly 1000 or 100,000.

Rahe-Holmes social readjustment rating scale See life events.

Ramazzini, Bernardino (1633-1714) An Italian physician, "Father of Occupational
Medicine;" he published De Morbis Artificum (On the Diseases of Workers) in 1700.
Based on observation and anecdote, this was the first systematic account of diseases 
related to workplace exposures. 

random Governed by chance; not completely determined by other factors. As opposed 
to deterministic. 

random allocation See randomization. 

randomization Allocation of individuals to groups, e.g., for experimental and control
regimens, by chance. Within the limits of chance variation, randomization should
make the control and experimental groups similar at the start of an investigation
and ensure that personal judgment and prejudices of the investigator do not
influence allocation.

Randomization or random assignment should not be confused with haphazard
assignment. Random assignment follows a predetermined plan that is usually
devised with the aid of a table of random numbers. The pattern of assignment may
appear to be haphazard, but this arises from the haphazard nature with which
digits occur in a table of random numbers, and not from the haphazard whim of
the investigator in allocating patients.
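
A minimal sketch of a predetermined allocation plan follows. It substitutes a seeded pseudo-random number generator for the printed table of random numbers mentioned above; the function name, subject identifiers, and seed are hypothetical. Fixing the seed in advance means the plan can be reproduced and audited, and the investigator's preferences play no part.

```python
import random


def randomize(subject_ids, seed):
    """Allocate subjects by chance to two groups of (nearly) equal size."""
    rng = random.Random(seed)      # the seed fixes the plan in advance
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"study": sorted(shuffled[:half]), "control": sorted(shuffled[half:])}


allocation = randomize(range(1, 21), seed=42)
print(allocation["study"])
print(allocation["control"])
```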

randomized controlled trial (RCT) An epidemiologic experiment in which subjects
in a population are randomly allocated into groups, usually called “study” and
“control” groups, to receive or not to receive an experimental preventive or therapeutic
procedure, maneuver, or intervention. The results are assessed by rigorous
comparison of rates of disease, death, recovery, or other appropriate outcome in the
study and control groups, respectively. Randomized controlled trials are generally
regarded as the most scientifically rigorous method of hypothesis testing available
in epidemiology. A few authors refer to this method as “randomized control trial.”
See also experimental epidemiology.

random sample A sample that is arrived at by selecting sample units such that each 
possible unit has a fixed and determinate probability of selection. See also sample. 

range of distribution The difference between the largest and smallest values in a
distribution. 

ranking scale (Ordinal Scale) A scale that arrays the members of a group from high 
to low according to the magnitude of the observations, assigns numbers to the ranks, 
and neglects distances between members of the array.

rate A rate is a measure of the frequency of a phenomenon. In epidemiology,
demography, and vital statistics, a rate is an expression of the frequency with which an
event occurs in a defined population; the use of rates rather than raw numbers is
essential for comparison of experience between populations at different times,
different places, or among different classes of persons.

The components of a rate are the numerator, the denominator, the specified time 
in which events occur, and usually a multiplier, a power of 10, which converts the 
rate from an awkward fraction or decimal to a whole number: 

$$\text{Rate} = \frac{\text{Number of events in specified period}}{\text{Average population during the period}} \times 10^n$$

All rates are ratios, calculated by dividing a numerator, e.g., the number of deaths,
or newly occurring cases of a disease in a given period, by a denominator, e.g., the
average population during that period. Some rates are proportions, i.e., the
numerator is contained within the denominator. Rate has several different usages in
epidemiology.

1. As a synonym for ratio, it refers to proportions as rates, as in the terms
cumulative incidence rate, prevalence rate, survival rate (cf. Webster's Dictionary,
which gives proportion and ratio as synonyms for rate).

2. In other situations, rate refers only to ratios representing relative changes (ac¬ 
tual or potential) in two quantities. This accords with the OED, which gives 
"relative amount of variation" among its entries for rate. 

3. Sometimes rate is further restricted to refer only to ratios representing changes 
over time. In this usage, prevalence rate would not be a "true” rate because it 
cannot be expressed in relation to units of time but only to a “point” in time; 
in contrast, the force of mortality or force of morbidity (hazard rate) is a “true” 
rate for it can be expressed as the number of cases developing per unit time, 
divided by the total size of the population at risk. 
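
The components of a rate listed at the head of this entry (numerator, denominator, specified period, and a power-of-10 multiplier) can be put together in a short sketch. The function name and the figures are hypothetical.

```python
def rate(events, average_population, power_of_ten=3):
    """Events in a specified period per 10**power_of_ten population."""
    if average_population <= 0:
        raise ValueError("average population must be positive")
    return events / average_population * 10 ** power_of_ten


# Hypothetical year: 36 deaths in an average population of 48,000.
deaths_per_1000 = rate(36, 48_000, power_of_ten=3)
deaths_per_100000 = rate(36, 48_000, power_of_ten=5)
print(f"{deaths_per_1000:.2f} per 1,000; {deaths_per_100000:.0f} per 100,000")
```

The multiplier changes only how the figure is expressed, not the underlying frequency.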

rate difference (rd) The absolute difference between two rates, for example, the
difference in incidence rate between a population group exposed to a causal factor
and a population group not exposed to the factor:

$$RD = I_e - I_u$$

where $I_e$ = incidence rate among exposed, and $I_u$ = incidence rate among unexposed.
In comparisons of exposed and unexposed groups, the term excess rate may
be used as a synonym for rate difference.
rate-odds ratio See odds ratio. 

rate ratio (rr) The ratio of two rates. The term is used in epidemiologic research
with a precise meaning, i.e., the ratio of the rate in the exposed population to the
rate in the unexposed population:

$$RR = \frac{I_e}{I_u}$$

where $I_e$ is the incidence rate among exposed, and $I_u$ is the incidence rate among
unexposed. See also relative risk.
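
As a rough illustration of the rate difference and rate ratio, the sketch below computes both from two incidence rates. The function names, the use of person-years as the denominator, and the counts are assumptions made for this example only.

```python
def incidence_rate(cases, person_years):
    """New cases divided by person-time at risk (person-years assumed here)."""
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return cases / person_years


def rate_difference(rate_exposed, rate_unexposed):
    """Absolute difference between the exposed and unexposed rates."""
    return rate_exposed - rate_unexposed


def rate_ratio(rate_exposed, rate_unexposed):
    """Rate in the exposed divided by rate in the unexposed."""
    if rate_unexposed == 0:
        raise ValueError("rate ratio is undefined when the unexposed rate is zero")
    return rate_exposed / rate_unexposed


# Hypothetical data: 30 cases in 10,000 exposed person-years,
# 10 cases in 10,000 unexposed person-years.
i_exposed = incidence_rate(30, 10_000)
i_unexposed = incidence_rate(10, 10_000)

print("rate difference per 1000 person-years:",
      round(rate_difference(i_exposed, i_unexposed) * 1000, 3))
print("rate ratio:", round(rate_ratio(i_exposed, i_unexposed), 3))
```

The difference is an absolute measure (excess cases per unit of person-time), whereas the ratio is a relative measure with no units.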

ratio The value obtained by dividing one quantity by another: a general term of which
rate, proportion, percentage, etc., are subsets. The important difference between a
proportion and a ratio is that the numerator of a proportion is included in the
population defined by the denominator, whereas this is not necessarily so for a
ratio. A ratio is an expression of the relationship between a numerator and a denominator
where the two usually are separate and distinct quantities, neither being
included in the other.

The dimensionality of a ratio is obtained through algebraic cancellation, summation,
etc., of the dimensionalities of its numerator and denominator terms. Both
counted and measured values may be included in the numerator and in the denominator.
There are no general restrictions on the dimensionalities or ranges of ratios,
as there are in some of its subsets (e.g., proportion, prevalence). Ratios are sometimes
expressed as percentages (e.g., standardized mortality ratio, FEV1 percent).
In these cases, unlike the special case of a proportion, the value may exceed 100.
See also proportion; rate.

RATIO SCALE See MEASUREMENT SCALE. 

RECEIVER operating CHARACTERISTIC (roc) curve (Syn: relative operating characteristic
curve) A graphic means for assessing the ability of a screening test to discriminate
between healthy and diseased persons. The term "receiver operating characteristic"
comes from psychometry where the characteristic operating response of a
receiver-individual to faint stimuli or nonstimuli was recorded.

RECORD linkage A method for assembling the information contained in two or more
records, e.g., in different sets of medical charts, and in vital records such as birth
and death certificates, and a procedure to ensure that the same individual is counted
only once. This procedure incorporates a unique identifying system such as a personal
identification number and/or birth name(s) of the individual's mother.

Record linkage makes it possible to relate significant health events that are remote
from one another in time and place or to bring together records of different
individuals, e.g., members of a family. The resulting information is generally stored
and retrieved by computer, which can be programmed to tabulate and analyze the
data.

"Each person in the world creates a book of life. This book starts with birth and
ends with death. Its pages are made of the records of the principal events in life.
Record linkage is the name given to the process of assembling the pages of this
book into a volume."¹

¹Dunn HL: Record linkage. Am J Pub Health 36:1412, 1946.

recrudescence Reactivation of infection. 

Reed, Walter (1851-1902) US Army physician and epidemiologist. Responsible for
epidemiologic investigations and experiments that established the transmission of
yellow fever by a filterable virus carried by culicine mosquitoes. The rigorous logic
applied to both the experimental and incidental observations by Reed and his colleagues
is recognized as one of the great achievements of medical science.

reference population The standard against which a population that is being studied 
can be compared. 

refinement The process of identifying new subcategories of study variables for the
purpose of more accurate or more detailed description of relationships. An example
is refinement of the concept of serum cholesterol level into high, low, and very
low density lipoproteins.

REGISTER, registry In epidemiology the term "register" is applied to the file of data
concerning all cases of a particular disease or other health-relevant condition in a
defined population such that the cases can be related to a population base. With
this information incidence rates can be calculated. If the cases are regularly followed
up, information on remission, exacerbation, prevalence, and survival can also
be obtained. The register is the actual document, and the registry is the system of
ongoing registration.

In most developed countries all births and deaths are recorded through birth
and death registration systems. Results and summaries are then tabulated and published.
Examples of registries that have epidemiologic value include the following:

Cancer registries, which secure reports of cancer patients as soon as possible after 
first diagnosis. The principal sources for these reports are the hospitals serving the 
community, but a few cases are not reported until death. 

Twin registries, which have provided the basis for studies attempting to differentiate
genetic from environmental factors in the etiology of cancer, and other conditions
where both genetic and environmental factors may be contributing causes.

Birth defect registries, which seek to document anomalies that are apparent at or
soon after birth. They suffer from incompleteness due to omission of stillbirths and
of anomalies that do not declare their presence until later in life, such as certain
forms of congenital heart lesion, mental deficiency, and neurological disorders.

Other types of registers include blindness and other forms of physical handicap,
high-risk infants, persons addicted to drugs, etc. Most of these, however, are not
truly population based, but merely list those persons known to or attending some
agency or service that provides for them.

registration The term "registration" implies something more than notification for the
purpose of immediate action or to permit the counting of cases. A register requires
that a permanent record be established, including identifying data. Cases may be
followed up, and statistical tabulations may be prepared both on frequency and on
survival. In addition, the persons listed on a register may be subjects of special
studies.

REGRESSION 

1. As used by Francis Galton, regression meant the tendency for offspring of
exceptional parents (very tall, very intelligent, etc.) to possess characteristics
closer to the average for the general population. (Hence, "regression to the
mean.")

2. In statistics, regression is a synonym for regression analysis. 

regression analysis Given data on a dependent variable y and one or more independent
variables x1, x2, etc., regression analysis involves finding the "best" mathematical
model (within some restricted class of models) to describe y as a function of the x's,
or to predict y from the x's. The most common form is a linear model; in epidemiology,
the logistic and proportional hazards models are also common.
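
For the simplest case, a linear model with one independent variable, the "best" line is often taken to be the one that minimizes the sum of squared residuals (ordinary least squares). The sketch below assumes that criterion; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope). Least squares is an assumed criterion
    for "best"; other criteria and model classes are possible.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(intercept, slope)
```

Logistic and proportional hazards models follow the same idea of choosing parameters that best describe y, but they are usually fitted by maximum likelihood rather than least squares.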

regression line Diagrammatic presentation of a regression equation, usually drawn
with the independent variable, x, as the abscissa and the dependent variable, y, as
ordinate. Three variables can be shown diagrammatically on an isometric chart or
stereogram.

relationship See association. 

RELATIVE ODDS See ODDS RATIO. 

RELATIVE risk 

1. The ratio of the risk of disease or death among the exposed to the risk among 
the unexposed; this usage is synonymous with risk ratio. 

2. Alternatively, the ratio of the cumulative incidence rate in the exposed to the
cumulative incidence rate in the unexposed, i.e., the cumulative incidence ratio.

3. The term "relative risk" has also been used synonymously with "odds ratio"
and, in some biostatistical articles, has been used for the ratio of forces of
morbidity. The use of the term "relative risk" for several different quantities
arises from the fact that for "rare" diseases (e.g., most cancers) all the quantities
approximate one another. For common occurrences (e.g., neonatal mortality
in infants under 1500-g birth weight), the approximations do not hold.
See also cumulative incidence ratio; odds ratio; rate ratio; risk ratio.
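
The claim that the risk ratio and the odds ratio approximate one another for rare outcomes, but not for common ones, can be checked numerically. The sketch below uses invented counts; the function and parameter names are hypothetical.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome in the exposed divided by odds in the unexposed."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical rare outcome: 20 of 10,000 exposed vs. 10 of 10,000 unexposed.
rare = (20, 9_980, 10, 9_990)
# Hypothetical common outcome: 400 of 1,000 exposed vs. 200 of 1,000 unexposed.
common = (400, 600, 200, 800)

for label, counts in (("rare", rare), ("common", common)):
    print(label,
          "risk ratio =", round(risk_ratio(*counts), 3),
          "odds ratio =", round(odds_ratio(*counts), 3))
```

In both invented examples the risk ratio is 2, but the odds ratio stays close to 2 only when the outcome is rare.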

reliability The degree of stability exhibited when a measurement is repeated under
identical conditions. Reliability refers to the degree to which the results obtained by
a measurement procedure can be replicated. Lack of reliability may arise from divergences
between observers or instruments of measurement or instability of the
attribute being measured. See also measurement, problems with terminology;
OBSERVER VARIATION.

repeatability (Syn: reproducibility) A test or measurement is repeatable if the results
are identical or closely similar each time it is conducted. See also measurement,
PROBLEMS WITH TERMINOLOGY; RELIABILITY.

replacement level FERTILITY The level of fertility at which a cohort of women are
having only enough daughters to replace themselves in the population. By definition,
replacement level fertility is equal to a net reproduction rate of 1.00. The total
fertility rate is also used as a measure of replacement level fertility; in the United
States today, a total fertility rate of 2.12 is considered to be replacement level; it is
higher than 2 because of mortality and because of a sex ratio greater than 1 at
birth. The higher the mortality rate, the higher is replacement level fertility.
replication The execution of an experiment or survey more than once so as to confirm
the findings, increase precision, and obtain a closer estimation of sampling
error. Exact replication should be distinguished from consistency of results on replication.
Exact replication is often possible in the physical sciences, but in the biological and
behavioral sciences, to which epidemiology belongs, consistency of results on replication
is often the best that can be attained. Consistency of results on replication is
perhaps the most important criterion in judgments of causality.
representative sample The term "representative" as it is commonly used is undefined
in the statistical or mathematical sense; it means simply that the sample resembles
the population in some way.

The use of probability sampling will not ensure that any single sample will be
"representative" of the population in all possible respects. If, for example, it is found
that the sample age distribution is quite different from that of the population, it is
possible to make corrections for the known differences. A common fallacy lies in
the unwarranted assumption that, if the sample resembles the population closely
on those factors that have been checked, it is "totally representative" and that no
difference exists between the sample and the universe or reference population.

Kendall and Buckland¹ comment as follows: "In the widest sense, a sample which
is representative of a population. Some confusion arises according to whether 'representative'
is regarded as meaning 'selected by some process which gives all samples
an equal chance of appearing to represent the population'; or, alternatively,
whether it means 'typical in respect of certain characteristics, however chosen'. On
the whole, it seems best to confine the word 'representative' to samples which turn
out to be so, however chosen, rather than apply it to those chosen with the object
of being representative."

¹Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
reproducibility See repeatability. 

reproductive isolation Absence of interbreeding between populations. 
research design The procedures and methods, predetermined by an investigator, to
be adhered to in conducting a research project.

RESERVOIR OF INFECTION 

1. Any person, animal, arthropod, plant, soil, or substance, or a combination of
these, in which an infectious agent normally lives and multiplies, on which it
depends primarily for survival, and where it reproduces itself in such a manner
that it can be transmitted to a susceptible host.

2. The natural habitat of the infectious agent. 

resolution (Syn: resolving power) A component of a measuring instrument that helps
determine precision. The degree of refinement of the measuring process is commonly
referred to as the "resolution" or the "resolving power" of the system. See
also power. The capability of distinguishing between things that are indeed separate
or distinct from one another.

resolving POWER The capacity of a system to distinguish between truly distinct things 
that are close together. 

response rate The number of completed or returned survey instruments (questionnaires,
interviews, etc.) divided by the total number of persons who would have
been surveyed if all had participated. Usually expressed as a percentage. Nonresponse
can have several causes, e.g., death, removal out of the survey community,
and refusal. See also bias; completion rate; nonparticipants.

retrolective Pertaining to data gathered from medical records or other sources, when
data collection took place without prior planning for the needs of an investigation.
See also prolective. Term in limited use.

retrospective study A research design that is used to test etiologic hypotheses in
which inferences about exposure to the putative causal factor(s) are derived from
data relating to characteristics of the persons under study or to events or experiences
in their past. The essential feature is that some of the persons under study
have the disease or other outcome condition of interest, and their characteristics
and past experiences are compared with those of other, unaffected persons. Persons
who differ in the severity of the disease may also be compared. There is disagreement
among epidemiologists as to the desirability of using the term "retrospective
study" rather than "case control study" to describe this method. See also
CASE CONTROL STUDY.

retrovirus This name is given to a family of RNA viruses characterized by the presence
of an enzyme, reverse transcriptase, that enables transcription of RNA to DNA
inside an affected cell. Thus, retroviruses can make copies of themselves in host
cells. The most important retrovirus is the human immunodeficiency virus (HIV);
this makes copies of itself in host cells such as T4 "helper" lymphocytes and normal
immune responses are disrupted.

risk The probability that an event will occur, e.g., that an individual will become ill or
die within a stated period of time or age. Also, a nontechnical term encompassing
a variety of measures of the probability of a (generally) unfavorable outcome. See
also PROBABILITY.

RISK ASSESSMENT The qualitative or quantitative estimation of the likelihood of adverse 
effects that may result from exposure to specified health hazards or from the ab¬ 
sence of beneficial influences. 

RISK BENEFIT analysis The process of analyzing and comparing on a single scale the
expected positive (benefits) and negative (risks, costs) results of an action, or lack of
an action.

risk benefit ratio The results of a risk benefit analysis, expressed as the ratio of risks
to benefits.

RISK difference (Syn: excess risk) The absolute difference between two risks. 

Risk factor An aspect of personal behavior or lifestyle, an environmental exposure,
or an inborn or inherited characteristic, which on the basis of epidemiologic evidence
is known to be associated with health-related condition(s) considered important
to prevent. The term "risk factor" is rather loosely used, with any of the following
meanings:

1. An attribute or exposure that is associated with an increased probability of a
specified outcome, such as the occurrence of a disease. Not necessarily a causal
factor. A risk marker.

2. An attribute or exposure that increases the probability of occurrence of dis¬ 
ease or other specified outcome. A determinant. 

3. A determinant that can be modified by intervention, thereby reducing the
probability of occurrence of disease or other specified outcomes. To avoid
confusion, it may be referred to as a modifiable risk factor.

RISK management The steps taken to alter, i.e., reduce, the levels of risk to which an
individual or a population is subject.

RISK MARKER (Syn: risk indicator) An attribute that is associated with an increased
probability of occurrence of a disease or other specified outcome and that can be
used as an indicator of this increased risk. Not necessarily a causal factor. See also
RISK FACTOR.

risk ratio The ratio of two risks. 

robust A statistical test or procedure is said to be robust if it is not very sensitive to
departures from the assumptions on which it is strictly predicated (e.g., that the data
are normally distributed).

Ross, Ronald (1857-1932) Continued in India the work begun by Laveran and Manson
on mosquitoes as vectors of infectious disease. In a series of experiments and
microscopic dissections, he concluded that only the anopheles mosquitoes carried
the malaria parasite and that a developmental stage of the parasite took place in
the mosquito (On some peculiar pigmented cells found in two mosquitoes fed on
malarial blood. Brit Med J 1780-1787, 1897). Awarded the Nobel prize for medicine
in 1902.

rubric Section or chapter heading. Used in epidemiology with reference to groups of
diseases, e.g., as in the international classification of disease (icd).





safety factor A multiplicative factor incorporated in risk assessments or safety standards
to allow for unpredictable types of variation, such as variability from test
animals to humans, random variation within an experiment, and person-to-person
variability. Safety factors are often in the range of 10 to 1000.

safety standards Under the requirements of the Occupational Safety and Health Act
(OSHA, 1970), "occupational safety and health standard" means a standard that
requires conditions, or the adoption of one or more practices, means, methods,
operations, or processes reasonably necessary or appropriate to provide safe or
healthful employment and places of employment. Safety standards may be adopted
by national consensus or established by federal regulation. These standards have
been adopted in many other nations besides the United States, although some European
and other countries have their own standards, which may be lower or higher
than those in the United States.

There are several varieties of safety standards: 

1. OSHA-promulgated, mainly for carcinogens, also for cotton dust and lead.
These are Permissible Exposure Limits (PELs).

2. National Institute of Occupational Safety and Health (NIOSH) recommendations,
often lower limits, based on animal toxicity tests, empirical observations,
epidemiologic investigations; these are Recommended Exposure Limits (RELs).

3. An older-established set of criteria has been set by the American Conference
of Governmental Industrial Hygienists; these are Threshold Limit Values
(TLVs) that have now replaced an earlier set of Maximum Allowable Concentrations
(MACs).

sample A selected subset of a population. A sample may be random or nonrandom
and may be representative or nonrepresentative. Several types of sample can be
distinguished, including the following:

Cluster sample: Each unit selected is a group of persons (all persons in a city block,
a family, etc.) rather than an individual.

Grab sample (Syn: sample of convenience): These ill-defined terms describe samples
selected by easily employed but basically nonprobabilistic methods. "Man-in-the-street"
surveys and a survey of blood pressure among volunteers who drop in
at an examination booth in a public place are in this category. It is improper to
generalize from the results of a survey based upon such a sample for there is no
way of knowing what sorts of bias may have been operating. See also bias.

Probability (random) sample: All individuals have a known chance of selection. They
may all have an equal chance of being selected, or, if a stratified sampling method
is used, the rate at which individuals from several subsets are sampled can be varied
so as to produce greater representation of some classes than of others.

A probability sample is created by assigning an identity (label, number) to all
individuals in the "universe" population, e.g., by arranging them in alphabetical
order and numbering in sequence, or simply assigning a number to each, or by
grouping according to area of residence and numbering the groups. The next step
is to select individuals (or groups) for study by a procedure such as use of a table
of random numbers (or comparable procedure) to ensure that the chance of selection
is known.

Simple random sample: In this elementary kind of sample each person has an equal
chance of being selected out of the entire population. One way of carrying out this
procedure is to assign each person a number, starting with 1, 2, 3, and so on. Then
numbers are selected at random, preferably from a table of random numbers, until
the desired sample size is attained.

Stratified random sample: This involves dividing the population into distinct subgroups
according to some important characteristic, such as age or socioeconomic status,
and selecting a random sample out of each subgroup. If the proportion of the
sample drawn from each of the subgroups, or strata, is the same as the proportion
of the total population contained in each stratum (e.g., age group 40-59 constitutes
20% of the population, and 20% of the sample comes from this age stratum), then
all strata will be fairly represented with regard to numbers of persons in the sample.

Systematic sample: The procedure of selecting according to some simple, systematic
rule, such as all persons whose names begin with specified alphabetic letters, born
on certain dates, or located at specified points on a master list. A systematic sample
may lead to errors that invalidate generalizations. For example, persons' names
more often begin with certain letters of the alphabet than with other letters, e.g., q,
x. A systematic alphabetical sample is therefore likely to be biased.
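
Three of the sampling schemes described above can be sketched with Python's standard `random` module, which stands in here for a table of random numbers. The function names, the proportional allocation rule, and the toy population are assumptions made for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Each member has an equal chance of selection (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_random_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, round(len(members) * fraction))
        for name, members in strata.items()
    }


def systematic_sample(master_list, step, start=0):
    """Take every `step`-th entry of a master list, beginning at index `start`."""
    return master_list[start::step]


# Hypothetical population of 100 numbered persons.
people = list(range(1, 101))
strata = {"age 40-59": people[:20], "other ages": people[20:]}

print(simple_random_sample(people, 10, seed=1))
print({k: len(v) for k, v in stratified_random_sample(strata, 0.10, seed=1).items()})
print(systematic_sample(people, step=10, start=4))
```

With proportional allocation the 40-59 stratum, which holds 20% of this toy population, also supplies 20% of the sample. The systematic sample involves no random element beyond the choice of starting point, which is why an ordering that correlates with the characteristic under study can bias it.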

SAMPLE, EPSEM ("equal probability of selection method") A sample selected in such a
manner that all the population units have the same probability of selection. A simple
random sample is an Epsem sample; a stratified sample is not unless the probability
of selection is the same for all strata.

SAMPLING The process of selecting a number of subjects from all the subjects in a particular
group or "universe." Conclusions based on sample results may be attributed
only to the population sampled. Any extrapolation to a larger or different population
is a judgment or a guess and is not part of statistical inference.

SAMPLING ERROR See ERROR.

SAMPLING variation Since the inclusion of individuals in a sample is determined by
chance, the results of analysis in two or more samples will differ, purely by chance.
This is known as "sampling variation."
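
Sampling variation can be seen by drawing several random samples from one population and comparing a summary statistic across them. The sketch below is an illustration only; the function name, the seed, and the population values are hypothetical.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Mean of each of `n_samples` simple random samples from the same population."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 100 measurements with values 100 to 199.
population = list(range(100, 200))
means = sample_means(population, sample_size=20, n_samples=5)

print("population mean:", statistics.mean(population))
print("sample means:   ", means)
```

The sample means scatter around the population mean even though every sample was drawn by the same procedure; larger samples would typically scatter less.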

SANITARY CORDON See CORDON SANITAIRE. 

scatter diagram (Syn: scattergram) A graphic method of displaying the distribution
of two variables in relation to each other. The values for one variable are measured 
on the horizontal axis and the values for the other on the vertical axis. 

scenario building A method of predicting the future that relies on a series of assumptions
about alternative possibilities, rather than on simple extrapolation of existing
trends. Trend lines for demographic composition, morbidity and mortality
rates, etc., can then be modified by allowing for each assumption in turn, or combinations
of assumptions. The method is claimed to lead to greater flexibility in
long-range health planning than simple forecasting that relies only upon extrapolation
of trends.

screening Screening was defined in 1951 by the US Commission on Chronic Illness as,
"The presumptive identification of unrecognized disease or defect by the application
of tests, examinations or other procedures which can be applied rapidly.
Screening tests sort out apparently well persons who probably have a disease from
those who probably do not. A screening test is not intended to be diagnostic. Persons
with positive or suspicious findings must be referred to their physicians for
diagnosis and necessary treatment."

Screening is an initial examination only, and positive responders require a second,
diagnostic examination. The initiative for screening usually comes from the
investigator or the person or agency providing care rather than from a patient with
a complaint. Screening is usually concerned with chronic illness and aims to detect
disease not yet under medical care.

There are different types of medical screening, each with its own aim: mass,
multiple or multiphasic, and prescriptive.

Mass screening simply means the screening of a whole population. 

Multiple or multiphasic screening involves the use of a variety of screening tests 
on the same occasion. 

Prescriptive screening has as its aim the early detection in presumptively healthy
individuals of disease that can be controlled better if detected early in its natural
history.

The characteristics of a screening test include accuracy, estimates of yield, preci¬ 
sion, reproducibility, sensitivity and specificity, and validity. See entries under these 
headings. 

screening level The normal limit or cutoff point at which a screening test is regarded
as positive.

seasonal variation Change in physiological status or in disease occurrence that con¬ 
forms to a regular seasonal pattern. 

secondary attack rate The proportion of contacts who get a communicable disease
as a consequence of contact with a case. The secondary attack rate is a measure
of contagiousness and is useful in evaluating control measures. See also attack
rate.

secular trend (Syn: temporal trend) Changes over a long period of time, generally
years or decades. Examples include the decline of tuberculosis mortality and the
rise, followed by a decline, in coronary heart disease mortality in the United States
and many other countries in the past 50 years.

selection In genetics, the force that brings about changes in the frequency of alleles
and genotypes in populations through differential reproduction. In epidemiology,
the process and procedure for choosing individuals for study, usually by an orderly
means such as random allocation.

selection bias See bias. 

Semmelweis, Ignaz Philipp (1818-1865) An Austro-Hungarian physician-obstetrician,
who discovered the cause of puerperal fever by carefully comparing infection rates
in two wards of the Allgemeines Krankenhaus in Vienna. In one ward students customarily
came direct from the mortuary or the dissecting room to the patients'
bedside whereas in the other, they did not. Puerperal infection death rates were
much greater in the former. Semmelweis concluded that some morbid factor was
thus transmitted to women in the worse-affected ward. Unhappily, his conclusions
were rejected by his colleagues.

sensitivity and specificity (of a screening test) Sensitivity is the proportion of truly 
diseased persons in the screened population who are identified as diseased by the 
screening test. Sensitivity is a measure of the probability of correctly diagnosing a 
case, or the probability that any given case will be identified by the test (Syn: true 
positive rate). 

Specificity is the proportion of truly nondiseased persons who are so identified by
the screening test. It is a measure of the probability of correctly identifying a nondiseased
person with a screening test (Syn: true negative rate). The relationships
are shown in the following fourfold table, in which the letters a, b, c, and d represent
the quantities specified below the table.

                          True status
Screening test results    Diseased    Not diseased    Total
Positive                  a           b               a + b
Negative                  c           d               c + d
Total                     a + c       b + d           a + b + c + d

a. Diseased individuals detected by the test (true positives)
b. Nondiseased individuals positive by the test (false positives)
c. Diseased individuals not detectable by the test (false negatives)
d. Nondiseased individuals negative by the test (true negatives)

$$\text{Sensitivity} = \frac{a}{a + c} \qquad \text{Specificity} = \frac{d}{b + d}$$

$$\text{Predictive value (positive test result)} = \frac{a}{a + b}$$

$$\text{Predictive value (negative test result)} = \frac{d}{c + d}$$

See also Youden's test. 
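
The four measures defined from the fourfold table can be computed directly from the cell counts a, b, c, and d. The sketch below is a minimal illustration; the function name `screening_measures` and the counts are hypothetical.

```python
def screening_measures(a, b, c, d):
    """Measures from a fourfold table of screening results.

    a: true positives, b: false positives,
    c: false negatives, d: true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "predictive value positive": a / (a + b),
        "predictive value negative": d / (c + d),
    }


# Hypothetical screening of 1,000 persons, 100 of whom are truly diseased.
for name, value in screening_measures(a=90, b=50, c=10, d=850).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are calculated within the columns of the table (true status), whereas the predictive values are calculated within its rows (test result) and therefore also depend on how common the disease is in the screened population.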

sensitivity testing A study of how the final outcome of an analysis changes as a function
of varying one or more of the input parameters in a prescribed manner.

sentinel HEALTH EVENT A condition that can be used to assess the stability or change
in health levels of a population, usually by monitoring mortality statistics. Thus,
death due to acute head injury is a sentinel event for a class of severe traffic injury
that may be reduced by such preventive measures as use of seatbelts and crash
helmets.

sentinel physician, sentinel practice In family medicine, a physician or practice that
undertakes to maintain surveillance for and report certain specific predetermined
events, such as cases of certain communicable diseases, adverse drug reactions.

sequential analysis A statistical method that allows an experiment to be ended as
soon as an answer of the desired precision is obtained. Study and control subjects
are randomly allocated in pairs or blocks. The result of the comparison of each pair
of subjects, one treated and one control, is examined as soon as it becomes available
and is added to all previous results.

serendipity The accidental (and happy) discovery of important new information. A
well-known example is Fleming's discovery of the bactericidal properties of penicillin
mould. In case-control studies aimed at testing a specific hypothesis, e.g., about
the relationship between tobacco and cancer, questions on other aspects of life-style
have serendipitously revealed statistically significant associations, e.g., between alcohol
consumption and certain cancers.

seroepidemiology Epidemiologic study or activity based on the detection on serological
testing of characteristic change in the serum level of specific antibodies. Latent,
subclinical infections and carrier states can thus be detected, in addition to clinically
overt cases.

SEX ratio The ratio of one sex to the other. Usually defined as the ratio of males to
females (or of the rates observed in males and females).

"shoe-leather" epidemiology Gathering information for epidemiologic studies by
direct inquiry among the people, e.g., walking from door to door and asking questions
of every householder (wearing out shoe leather in the process). John Snow
did this when investigating the sources of water supply to households in the cholera
epidemic in London in 1854; the method has been successfully used in many subsequent
epidemic investigations. It is especially useful in
transmitted diseases. 

siblings Children borne by the same mother. 

sibship All the brothers and sisters borne by the same mother

sickness See disease. 

side effect An effect, other than the intended one, produo
nostic, or therapeutic procedure or regimen. 

signal-to-noise ratio A jargon term for the relationship of 
which is extraneous or irrelevant, or intrudes because n 
other procedures are insufficiently sensitive. 

significance See statistical significance. 

Simpson’s paradox A form of confounding, in which the pi 
variable changes the direction of an association. Simpsoi 
meta-analysis, because the sum of the data or results fro 
studies may be affected by confounding variables that ha 
sign features from some studies but not others; if this 
analysis will be flawed. Rothman¹ has pointed out that . l 
really a paradox but the logical consequence of failing to i 
confounding variables. 

¹Rothman KJ: A pictorial representation of confounding in epiden 

28: 101-108, 1975. 
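
The change of direction described in this entry can be illustrated with a small numerical sketch. The counts are hypothetical and chosen only to show the arithmetic; the names `data`, `rate`, and `pooled_rate` are illustrative, and disease severity stands in for the confounding variable.

```python
# Hypothetical counts (not from any real study): (deaths, patients)
# for two treatments, A and B, within two severity strata.
data = {
    "mild":   {"A": (10, 200),  "B": (30, 500)},
    "severe": {"A": (150, 500), "B": (70, 200)},
}


def rate(deaths, patients):
    """Death rate as a simple proportion."""
    return deaths / patients


def pooled_rate(treatment):
    """Death rate for one treatment, ignoring the severity strata."""
    deaths = sum(data[s][treatment][0] for s in data)
    patients = sum(data[s][treatment][1] for s in data)
    return rate(deaths, patients)


# Stratum-specific rates: treatment A has the lower rate in both strata.
stratum_rates = {
    s: {t: rate(*data[s][t]) for t in ("A", "B")} for s in data
}

# Pooled rates: the direction of the comparison is reversed.
pooled_rates = {t: pooled_rate(t) for t in ("A", "B")}

for s, r in stratum_rates.items():
    print(s, r)
print("pooled", pooled_rates)
```

Within each stratum A has the lower death rate, yet the pooled rates favor B, because in these made-up data A was given mostly to severe cases.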

simulation The use of a model system, e.g., a mathematical m 
to approximate the action of a real system, often used to 
real system. 

situation analysis Study of a situation that may require in 
with a definition of the problem, and an assessment or m 
severity, causes, and impacts upon the community, and is 
interactions between the system and its environment an 
mance. 

skew distribution An older and less recommended term I 
quency distribution. If a unimodal distribution has a long 
lower values of the variate, it is said to have negative skewt 
positive skewness. See also log-normal distribution. 

Skew distribution of attack rate of measles in rela 
From Lilienfeld and Lilienfeld, 1979. 

slow virus Agent causing degenerative (neurological) disease 
incubation period and a prolonged, slowly progressive cou 
firmed slow virus diseases are Creutzfeldt-Jakob disease 
rosis is possibly a slow virus disease. Some cases of AIDS 
ease. 

Snow, John (1813-1858) London general practitioner and early anesthetist (he assisted 
Queen Victoria's delivery of two of her children with chloroform). His fame rests 
upon his observations, brilliant deductions, painstaking personal enquiries, and 
analytic studies of cholera outbreaks in the mid-19th century in London and 
elsewhere. All are recorded in On the Mode of Communication of Cholera (London: 
Churchill, 2nd ed., 1855), which can be regarded as the first definitive working text on 
epidemiology and which also contained an explicit statement of the germ theory of 
transmission, written 30 years before Koch discovered the cholera vibrio. See also 
NATURAL EXPERIMENT. 

social class A stratum in society composed of individuals and families of equal 
standing. See also socioeconomic classification. 

social drift Downward social class mobility as a result of impaired health, often due 
to mental disorders. 

social medicine The practice of medicine concerned with health and disease as a 
function of group living. Social medicine is concerned with the health of people in 
relation to their behavior in social groups and as such involves care of the individual 
patient as a member of a family and of other significant groups in everyday life. It 
is also concerned with the health of these groups as such and with that of the whole 
community as a community. See also community medicine; public health. 

socioeconomic classification Arrangement of persons into groups according to such 
characteristics as prior education, occupation, and income; this usually reveals upon 
analysis a strong correlation with health-related characteristics such as average length 
of life and risk of dying from certain specific causes. 

The oldest such classification that is epidemiologically useful is the 
Registrar-General's (RG's) occupational classification, developed in 1911 by Stephenson, 
Registrar General of England and Wales. This classified all occupations into five 
groups: the five "social classes." Social class III is often further subdivided into 
nonmanual and manual groups: 

I Professional occupations 
II Intermediate occupations 
IIIn Nonmanual skilled occupations 
IIIm Manual skilled occupations 
IV Partly skilled occupations 
V Unskilled occupations 

This has proven to be a valuable epidemiologic tool; social class is an accurate, 
consistent predictor of health experience. 

There have been several other attempts to develop a more refined classification; 
however, most refinements require collection of more detailed information. For 
example, Hollingshead's scale requires details about education and income as well 
as occupation, and so is more time-consuming, more likely to be incomplete, and 
requires more costly analysis than the RG's classification. In developing countries, 
where up to 90% of the population may be classified under "agriculturalist" or 
"pastoralist" (farming or herding), other types of classifications have been 
developed. 

One's prestige in society, and attitudes or values, e.g., setting a high value on 
gaining a good education, are generally an integral part of social class or 
socioeconomic status. Attitudes toward health are often part of the set of values and may 
explain part of the observed difference in health between social classes. 

socioeconomic status (SES) Descriptive term for a person's position in society, which 
may be expressed on an ordinal scale using such criteria as income, educational 
level attained, occupation, value of dwelling place, etc. 

SOFTWARE See COMPUTER. 

soundex code A sequence of letters used for recording names phonetically, especially 
in record linkage. 
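
As an illustration only, here is a minimal sketch of one widely used variant, American Soundex, which records a name as its initial letter followed by three digits. The entry above does not say which variant is meant, so the coding table and the function name `soundex` should be read as assumptions.

```python
# Consonant groups of American Soundex; vowels, H, W and Y carry no code.
CODES = {}
for digit, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
    "4": "L", "5": "MN", "6": "R",
}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the four-character Soundex code of a name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")      # code of the previous coded letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                 # H and W do not break a run of equal codes
        code = CODES.get(ch, "")     # vowels and Y give "" and do break a run
        if code and code != prev:
            digits.append(code)
        prev = code
    # Pad with zeros and keep the initial letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


for example in ("Robert", "Rupert", "Ashcraft", "Tymczak", "Pfister"):
    print(example, soundex(example))
```

Names that sound alike, such as Robert and Rupert, receive the same code, which is why such codes are useful for matching records whose spellings differ.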

source of infection The person, animal, object, or substance from which an 
infectious agent passes to a host. Source of infection should be clearly distinguished 
from source of contamination, such as overflow of a septic tank contaminating a 
water supply, or an infected cook contaminating a salad. (See reservoir.)¹ 

¹From Control of Communicable Diseases in Man, 14th ed. Washington DC: American Public Health 
Association, 1985. 

spearman’s rank correlation See CORRELATION coefficient. 

SPECIFICATION 

1. The process of selecting a particular functional form or model for the 
relationships to be analyzed in a study. 

2. The process of selecting variables for inclusion in the analysis of an effect or 
association. This process leads to the identification of moderator variables 
and confounding variables. See also stratification. 

specificity (of a test) See SENSITIVITY AND SPECIFICITY. 

spectrum of disease The full range of manifestations of a disease; a vague term that 
can mean everything from mild or subclinical or precursor states to fulminating, 
florid disease, or alternatively the natural history of a disease from onset to 
resolution. 

spell of sickness An episode of sickness with a well-defined onset and termination. 
As used in the monitoring or surveillance of disease, the spell is often defined by 
the duration of absence from work or school. 

spleen rate A term used in malaria epidemiology to define the frequency of enlarged 
spleens detected on survey of a population in which malaria is prevalent. In 
association with the Hackett spleen classification it summarizes the severity of 
malaria endemicity. 

sporadic Occurring irregularly, haphazardly from time to time, and generally 
infrequently, e.g., cases of certain infectious diseases. 

spot map Map showing the geographic location of people with a specific attribute, e.g., 
cases of a disease or elderly persons living alone. The making of a spot map is a 
common procedure in the investigation of a localized outbreak of disease. 
Inferences from such a map depend on the assumption that the population at risk of 
developing the disease is fairly evenly distributed over the area, or that at least the 
heterogeneities are known and can be considered in interpreting the map. 

stable population A population that has constant fertility and mortality rates, no 
migration, and consequently a fixed age distribution and constant growth rate. See 
also stationary population. 

standard Something that serves as a basis for comparison; a technical specification or 
written report drawn up by experts based on the consolidated results of scientific 
study, technology, and experience, aimed at optimum benefits and approved by a 
recognized and representative body. 

standard deviation A measure of dispersion or variation. It is the most widely used 
measure of dispersion of a frequency distribution. It is equal to the positive square 
root of the variance. The mean tells where the values for a group are centered. 
The standard deviation is a summary of how widely dispersed the values are around 
this center. 

standard error The standard deviation of an estimate. 
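
The two entries above can be tied together with a short sketch. The measurements are hypothetical; the sample variance is taken with an n - 1 denominator, and the estimate whose standard error is shown is the sample mean, both of which are choices made here for illustration rather than anything the entries specify.

```python
import math
import statistics

values = [4.0, 7.0, 5.0, 9.0, 10.0]   # hypothetical measurements

mean = statistics.mean(values)          # where the values are centered
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
sd = math.sqrt(variance)                # positive square root of the variance

# Standard error of the mean: the standard deviation of the estimate
# "sample mean", assuming independent observations.
se_mean = sd / math.sqrt(len(values))

print(f"mean={mean:.2f} variance={variance:.2f} sd={sd:.3f} se={se_mean:.3f}")
```

The standard deviation describes the spread of the individual values, while the standard error describes how much the estimate itself would vary from sample to sample.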


standardization A set of techniques used to remove as far as possible the effects of 
differences in age or other confounding variables, when comparing two or more 
populations. The common method uses weighted averaging of rates specific for 
age, sex, or some other potential confounding variable(s) according to some 
specified distribution of these variables. There are two main methods, as follows: 

Direct method: The specific rates in a study population are averaged, using as weights 
the distribution of a specified standard population. The directly standardized rate 
represents what the crude rate would have been in the study population if that 
population had the same distribution as the standard population with respect to the 
variable(s) for which the adjustment or standardization was carried out. 

Indirect method: This is used to compare study populations for which the specific 
rates are either statistically unstable or unknown. The specific rates in the standard 
population are averaged, using as weights the distribution of the study population. 
The ratio of the crude rate for the study population to the weighted average so 
obtained is the standardized mortality (or morbidity) ratio, or SMR. The indirectly 
standardized rate itself is the product of the SMR and the crude rate for the 
standard population. 
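
The two methods can be sketched as follows. All figures are hypothetical, and the function names (`crude_rate`, `specific_rates`, `direct_rate`, `smr`, `indirect_rate`) are illustrative rather than taken from any source.

```python
# Hypothetical data: age group -> (population, deaths). Illustrative only.
study = {"0-39": (6000, 6), "40-64": (3000, 30), "65+": (1000, 50)}
standard = {"0-39": (40000, 20), "40-64": (40000, 200), "65+": (20000, 800)}


def crude_rate(pop):
    """Total deaths divided by total population."""
    return sum(d for _, d in pop.values()) / sum(n for n, _ in pop.values())


def specific_rates(pop):
    """Age-specific death rates."""
    return {age: d / n for age, (n, d) in pop.items()}


def direct_rate(study, standard):
    """Study-population rates averaged with standard-population weights."""
    rates = specific_rates(study)
    total = sum(n for n, _ in standard.values())
    return sum(rates[age] * standard[age][0] for age in standard) / total


def smr(study, standard):
    """Observed deaths over deaths expected at standard-population rates."""
    std_rates = specific_rates(standard)
    observed = sum(d for _, d in study.values())
    expected = sum(std_rates[age] * study[age][0] for age in study)
    return observed / expected


def indirect_rate(study, standard):
    """SMR multiplied by the crude rate of the standard population."""
    return smr(study, standard) * crude_rate(standard)


print("crude (study):", crude_rate(study))
print("directly standardized:", direct_rate(study, standard))
print("SMR:", smr(study, standard))
print("indirectly standardized:", indirect_rate(study, standard))
```

Observed over expected deaths is the same quantity as the ratio of the study crude rate to the weighted average described above, since both are divided by the study population total. The SMR is returned here as a plain ratio; multiplying by 100 gives the conventional form.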

standardized mortality (morbidity) ratio (SMR) The ratio of the number of events 
observed in the study group or population to the number that would be expected 
if the study population had the same specific rates as the standard population, 
multiplied by 100. 

standardized rate ratio (SRR) A rate ratio in which the numerator and denominator 
rates have been standardized to the same (standard) population distribution. 
standard metropolitan statistical area Because of the extensive interactions 
between a city and its surrounding areas, a unit encompassing both is needed as a 
base for statistical description. The concept of a standard metropolitan statistical 
area (SMSA) was introduced in the United States to furnish such a unit. To qualify 
as an SMSA an area has to meet criteria related to size, social and economic 
integration of the city and surrounding county or counties, minimum population 
density, and minimum proportion of the labor force engaged in nonagricultural work. 
stationary population A stable population that has a zero growth rate with constant 
numbers of births and deaths each year. 

statistics The science and art of collecting, summarizing, and analyzing data that are 
subject to random variation. The term is also applied to the data themselves and to 
summarization of the data. Statistical terms are defined by Kendall and Buckland.¹ 
¹Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982. 
statistical error See error. 
statistical inference See inference. 
statistical model See mathematical model. 

statistical significance Statistical methods allow an estimate to be made of the 
probability of the observed or greater degree of association between independent and 
dependent variables under the null hypothesis. From this estimate, in a sample of 
given size, the statistical "significance" of a result can be stated. Usually the level of 
statistical significance is stated by the P value. 
statistical test A procedure that is intended to decide whether a hypothesis about 
the distribution of one or more populations or variables should be rejected or 
accepted. Statistical tests may be parametric or nonparametric. 
stereogram (Syn: isometric chart) A graph or chart that displays more than two 
variables in a manner that appears three-dimensional to the eye. 
stochastic process A process that incorporates some element of randomness. 
strategy In game theory, a mathematical function. 
stratification The process of or result of separating a sample into several subsamples 
according to specified criteria such as age groups, socioeconomic status, etc. The 
effect of confounding variables may be controlled by stratifying the analysis of 
results. For example, lung cancer is known to be associated with smoking. To examine 
the possible association between urban atmospheric pollution and lung cancer, 
controlling for smoking, the population may be divided into strata according to 
smoking status. The association between air pollution and cancer can then be appraised 
separately within each stratum. Stratification is used not only to control for 
confounding effects but also as a way of detecting modifying effects. In this example, 
stratification makes it possible to examine the effect of smoking on the association 
between atmospheric pollution and lung cancer. 
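
A minimal sketch of the smoking example, using hypothetical cohort counts. The risk ratio is used here as the measure of association, which is an assumption; the entry does not name a particular measure, and `strata`, `risk_ratio`, and `stratum_rr` are illustrative names.

```python
# Hypothetical counts: (lung cancer cases, persons) by air pollution
# exposure, recorded separately within each smoking stratum.
strata = {
    "nonsmokers": {"polluted": (12, 4000), "clean": (6, 4000)},
    "smokers":    {"polluted": (90, 3000), "clean": (30, 3000)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    (cases_e, n_e), (cases_u, n_u) = exposed, unexposed
    return (cases_e / n_e) / (cases_u / n_u)


# Appraise the pollution-cancer association separately in each stratum.
stratum_rr = {
    name: risk_ratio(groups["polluted"], groups["clean"])
    for name, groups in strata.items()
}

for name, rr in stratum_rr.items():
    print(f"{name}: risk ratio = {rr:.2f}")
```

Because smoking status is fixed within a stratum, it cannot confound the comparison there. A clear difference between the stratum-specific ratios, as in these made-up figures, would suggest that smoking modifies the association.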
stratified randomization (Syn: blocked randomization) A randomization procedure 
in which strata are identified and subjects randomly allocated within each. This 
produces a situation intermediate between paired allocation and simple random 
allocation. 
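
One simple way to carry out such an allocation is sketched below: subjects are grouped by stratum, shuffled within each stratum, and then assigned to the arms in rotation. This is an illustrative scheme under stated assumptions, not a prescribed procedure; the function name, the arm labels, and the sample subjects are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_randomize(subjects, stratum_of,
                         arms=("treatment", "control"), seed=0):
    """Randomly allocate subjects to arms separately within each stratum."""
    rng = random.Random(seed)          # seeded so the allocation is reproducible
    by_stratum = defaultdict(list)
    for subject in subjects:
        by_stratum[stratum_of(subject)].append(subject)

    allocation = {}
    for members in by_stratum.values():
        rng.shuffle(members)           # random order within the stratum
        for i, subject in enumerate(members):
            # Rotating through the arms keeps them balanced in the stratum.
            allocation[subject] = arms[i % len(arms)]
    return allocation


# Hypothetical subjects: (identifier, age group).
subjects = [(i, "young" if i % 3 else "old") for i in range(20)]
alloc = stratified_randomize(subjects, stratum_of=lambda s: s[1], seed=1)

for subject in sorted(alloc):
    print(subject, alloc[subject])
```

Within every stratum the arm sizes differ by at most one, which is tighter than simple random allocation but looser than matching subjects in pairs.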

study design See research design. 
subclinical disease See disease, subclinical. 

surveillance Ongoing scrutiny, generally using methods distinguished by their 
practicability, uniformity, and frequently their rapidity, rather than by complete 
accuracy. Its main purpose is to detect changes in trend or distribution in order to 
initiate investigative or control measures. See also monitoring. 
surveillance of disease The continuing scrutiny of all aspects of occurrence and 
spread of a disease that are pertinent to effective control. 

Included are the systematic collection and evaluation of (1) morbidity and 
mortality reports, (2) special reports of field investigations of epidemics and of 
individual cases, (3) isolation and identification of infectious agents by laboratories, (4) data 
concerning the availability, use, and untoward effects of vaccines and toxoids, 
immune globulins, insecticides, and other substances used in control, (5) information 
regarding immunity levels in segments of the population, and (6) other relevant 
epidemiologic data. A report summarizing these data should be prepared and 
distributed to all cooperating persons and others with a need to know the results of 
the surveillance activities. The procedure applies to all jurisdictional levels of public 
health from local to international.¹ Serological surveillance identifies patterns of 
current and past infection using serological tests. See also seroepidemiology. 
¹Benenson AS (Ed.): Control of Communicable Diseases in Man, 14th ed. Washington DC: American 
Public Health Association, 1985. 

survey An investigation in which information is systematically collected but in which 
the experimental method is not used. A population survey may be conducted by 
face-to-face inquiry, by self-completed questionnaires, by telephone, postal service, 
or in some other way. Each method has its advantages and disadvantages. For 
instance, a face-to-face (interview) survey may be a better way than self-completed 
questionnaire to collect information on attitudes or feelings, but it is more costly. 
Existing medical or other records may contain accurate information, but not about 
a representative sample of the population. 

The information that is gathered in a survey is usually complex enough to 
require editing (for accuracy, completeness, etc.), coding, keypunching, i.e., entry on 
punch cards, and processing and analysis by machine or computer. The 
generalizability of results depends upon the extent to which the surveyed population is 
representative. 

The term "survey" is sometimes used in a narrow sense to refer specifically to a 
field survey. 

survey instrument The interview schedule, questionnaire, medical examination 
record form, etc., used in a survey. 

survival analysis A class of statistical procedures for estimating the survival 
function, and for making inferences about the effects on it of treatments, prognostic 
factors, exposures, and other covariates. 

survival curve A curve that starts at 100% of the study population and shows the 
percentage of the population still surviving at successive times for as long as 
information is available. May be applied not only to survival as such, but also to the 
persistence of freedom from a disease, or complication or some other endpoint. 
survival function (Syn: survival distribution) A function of time, usually denoted by 
S(t), that starts with a population 100% well at a particular time and provides the 
percentage of the population still well at later times. Survival functions may be 
applied to any discrete event, for example, disease incidence or relapse, death, or 
recovery after onset of disease (in which case the population is initially 100% 
diseased, and the "survival" function gives the percentage still diseased). 
survival rate (Syn: cumulative survival rate) The proportion of survivors in a group, 
e.g., of patients, studied and followed over a period. The proportion of persons in 
a specified group alive at the beginning of the time interval (e.g., a five-year period) 
who survive to the end of the interval. It is equal to 1 minus the cumulative 
mortality rate. May be studied by current or cohort life table methods. 
survival ratio The probability of surviving between one age and another; when 
computed for age groups, the ratios correspond to those of the person-years-lived 
function of a life table. 

survivorship study Use of a cohort life table to provide the probability that an event, 
such as death, will occur in successive intervals of time after diagnosis and, 
conversely, the probability of surviving each interval. The multiplication of these 
probabilities of survival for each time interval for those alive at the beginning of that 
interval yields a cumulative probability of surviving for the total period of study. 
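
The multiplication described in this entry can be sketched with a hypothetical cohort life table. The counts are invented, and the sketch assumes complete follow-up (no withdrawals or losses within an interval); the name `cumulative_survival` is illustrative.

```python
# Hypothetical cohort life table: one row per interval after diagnosis,
# given as (alive at start of interval, deaths during interval).
intervals = [(1000, 100), (900, 90), (810, 81)]


def cumulative_survival(intervals):
    """Cumulative probability of surviving to the end of each interval."""
    cumulative = 1.0
    curve = []
    for alive, deaths in intervals:
        p_survive = 1 - deaths / alive   # probability of surviving this interval
        cumulative *= p_survive          # product over intervals so far
        curve.append(cumulative)
    return curve


curve = cumulative_survival(intervals)
for k, s in enumerate(curve, start=1):
    print(f"end of interval {k}: cumulative survival = {s:.3f}")
```

Plotting these values against time, starting from 100% at diagnosis, gives the survival curve described above.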
Sydenham, Thomas (1624-1689) A great English physician in the tradition of 
Hippocrates and one of the founding fathers of epidemiology (although his ideas about 
the meteorological causes of epidemics were wrong). His writings contain many 
careful and comprehensive accounts of important epidemic diseases, notably 
plague, malaria, measles, dysentery, and scarlet fever. His Opera Omnia have been twice 
translated into English; the second (and better) two-volume translation by Latham 
was published by the Sydenham Society in 1848-1850. 
symbiosis The biological association of two or more species to their mutual benefit. 
symmetrical relationship An association between variables that does not have 
direction. 

The following four varieties can be distinguished: 

1. Functional interdependence, where one variable cannot exist without the other; 
e.g., prevalence is a function of incidence and duration. 

2. Common complex, where variables occur together without being 
interdependent or necessary to each other; e.g., the occurrence together of air pollution, 
poverty, poor housing, and overcrowding. 

3. Alternative indicators of the same entity; e.g., antibodies to a microorganism 
and history of specific infection caused by that microorganism. 

4. The effects of a common cause; e.g., clinical and biochemical changes in 
hepatitis. 

See also association, symmetrical. 

syndrome A symptom complex in which the symptoms and/or signs coexist more 
frequently than would be expected by chance on the assumption of independence. 
synergism, synergy The definition of synergism in epidemiology is somewhat 
controversial. We offer two definitions, the first a common dictionary definition, the 
second a more specific definition encountered in bioassay. 

1. A situation in which the combined effect of two or more factors is greater 
than the sum of their solitary effects. 

2. Two factors act synergistically if there are persons who will get the disease 
when exposed to both factors but not when exposed to either alone. Antagonism, 
the opposite of synergism, exists if there are persons who will get the 
disease when exposed to one of the factors alone, but not when exposed to 
both. Note that under these definitions two factors may act synergistically in 
some persons and antagonistically in others. 
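
The first definition can be given a rough numerical form. The sketch assumes that "effect" is measured as excess risk over the risk in persons exposed to neither factor, which is only one possible reading; the risks are hypothetical and the function name is illustrative.

```python
def excess_due_to_interaction(r00, r10, r01, r11):
    """Joint excess risk minus the sum of the two solitary excess risks.

    r00: risk with neither factor   r10: risk with the first factor only
    r01: risk with the second only  r11: risk with both factors
    """
    joint_effect = r11 - r00
    sum_of_solitary_effects = (r10 - r00) + (r01 - r00)
    return joint_effect - sum_of_solitary_effects


# Hypothetical risks over some fixed period.
interaction = excess_due_to_interaction(r00=0.01, r10=0.03, r01=0.05, r11=0.15)

# A positive value means the combined effect exceeds the sum of the
# solitary effects (definition 1); a negative value points the other way.
print(f"excess attributable to joint action: {interaction:.2f}")
```

Definition 2 concerns what happens to individual persons and cannot be read off from population risks alone, so this sketch addresses only definition 1.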

systematic error See bias. 

systems analysis This term is used with three similar meanings: 

1. The examination of various elements of a system with a view to ascertaining 
whether the proposed solution to a problem will fit into the system and, in 
turn, effect an overall improvement in the system. 

2. The analysis of an activity in order to determine precisely what is required of 
the system, how this can best be accomplished, and in what ways the computer 
can be useful. 

3. Systems analysis refers to any formal analysis whose purpose is to suggest a 
course of action by systematically examining the objectives, costs, effectiveness 
and risks of alternative policies or strategies and designing additional ones if 
those examined are found wanting. It is an approach to or way of looking at 
complex problems of choice under uncertainty; it is not yet a method. 

Takaki, Kanehiro (1849-1915) Japanese nobleman who studied medicine at St 
Thomas's Hospital Medical School, London. He became a naval surgeon, and later used 
his opportunity as director of naval medical services to conduct large-scale dietary 
experiments on populations of naval personnel, demonstrating that beriberi could 
be prevented by a mixed diet containing protein as well as rice. 
target population 

1. The collection of individuals, items, measurements, etc., about which we want 
to make inferences. The term is sometimes used to indicate the population 
from which a sample is drawn and sometimes to denote any "reference" 
population about which inferences are required. 

2. The group of persons for whom an intervention is planned. 
taxonomy A systematic classification into related groups. 

taxonomy of disease The orderly classification of diseases into appropriate categories 
on the basis of relationships among them, with the application of names. See also 
nosography, nosology. 

t-distribution, t-test The t-distribution is the distribution of a quotient of 
independent random variables, the numerator of which is a standardized normal variate 
and the denominator of which is the positive square root of the quotient of a 
chi-square distributed variate and its number of degrees of freedom. The t-test uses a 
statistic that, under the null hypothesis, has the t-distribution, to test whether two 
means differ significantly, or to test linear regression or correlation coefficients. 
The t-distribution and the t-test were developed by WS Gosset, who wrote under 
the pseudonym Student as his employment precluded individual publication. 
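
In symbols, the verbal definition above can be written as follows. The notation (Z, V, ν, and the sample quantities) is introduced here for illustration, and the two-sample statistic shown is the usual pooled-variance form, which assumes equal variances in the two groups; the entry itself does not specify which form is meant.

```latex
% Definition: Z standard normal, V chi-square with \nu degrees of freedom,
% Z and V independent.
T = \frac{Z}{\sqrt{V/\nu}} \sim t_{\nu}

% Two-sample test of equal means (pooled variance, n_1 + n_2 - 2 degrees of freedom):
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
```

Under the null hypothesis of equal means, and with normally distributed observations, the statistic t follows the t-distribution with n_1 + n_2 - 2 degrees of freedom.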
TERATOGEN A substance that produces abnormalities in the embryo or fetus by 
disturbing maternal homeostasis or by acting directly on the fetus in utero. 

test of significance See P value; statistical SIGNIFICANCE. 

test hypothesis See NULL HYPOTHESIS. 

theoretical epidemiology The development of mathematical/statistical models to 
explain different aspects of the occurrence of a variety of diseases. With some 
infectious diseases, models have been generated to elucidate the reasons for epidemics 
and/or to predict the behavior of the disease in reaction to given control 
measures. See also model. 
therapeutic trial See clinical trial. 
threshold limit value See safety standards. 

threshold phenomena Events or changes that occur only after a certain level of a 
characteristic is reached. 
time cluster See clustering. 
time-place cluster See clustering. 

total fertility rate (TFR) The average number of children that would be born per 
woman if all women lived to the end of their childbearing years and bore children 
according to a given set of age-specific fertility rates. It is computed by summing 
the age-specific fertility rates for all ages and multiplying by the interval into which 
the ages are grouped. The TFR is an important fertility measure, providing the 
most accurate answer to the question, "How many children does a woman have, on 
average?" 
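
The computation described above is short enough to sketch directly. The age-specific fertility rates are hypothetical, expressed as births per woman per year in five-year age groups; the names `asfr` and `total_fertility_rate` are illustrative.

```python
# Hypothetical age-specific fertility rates (births per woman per year).
asfr = {
    "15-19": 0.04, "20-24": 0.11, "25-29": 0.12,
    "30-34": 0.08, "35-39": 0.04, "40-44": 0.01,
}
interval_width = 5   # years covered by each age group


def total_fertility_rate(rates, width):
    """Sum the age-specific rates and multiply by the age-group width."""
    return width * sum(rates.values())


tfr = total_fertility_rate(asfr, interval_width)
print(f"TFR = {tfr:.2f} children per woman")
```

Multiplying by the width converts each annual rate into the births expected while a woman passes through that age group.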

tracer disease method Tracer or indicator conditions as defined by Kessner¹ are 
easily diagnosed, reasonably frequent illnesses or health states whose outcomes are 
believed to be affected by health care and which taken in aggregate should reflect 
the gamut of patients and health problems encountered in a medical practice. The 
extent to which the recorded care of these conditions concurs with preset standards 
of care is used as an index of the quality of care delivered. However, it should first 
be shown that the preset standards contribute to a favorable outcome. See also 
sentinel health event. 

¹Kessner DM, Snow CK, Singer J: Assessment of Medical Care for Children. Washington DC: National 
Academy of Sciences, Institute of Medicine, 1974. 

transmission of infection Transmission of infectious agents. Any mechanism by which 
an infectious agent is spread through the environment or to another person. These 
mechanisms are defined in Control of Communicable Diseases in Man¹ as follows: 

a. Direct transmission 

Direct and essentially immediate transfer of infectious agents (other than 
from an arthropod in which the organism has undergone essential 
multiplication or development) to a receptive portal of entry through which human 
infection may take place. This may be by direct contact as by touching, 
kissing, or sexual intercourse, or by the direct projection (droplet spread) of 
droplet spray onto the conjunctiva or onto the mucous membranes of the nose or 
mouth during sneezing, coughing, spitting, singing, or talking (usually limited 
to a distance of about 1 m or less). It may also be by direct exposure of 
susceptible tissue to an agent in soil, compost, or decaying vegetable matter in 
which it normally leads a saprophytic existence (e.g., the systemic mycoses), 
or by the bite of a rabid animal. Transplacental transmission is another form 
of direct transmission. 

b. Indirect transmission 

Vehicle-borne: Contaminated materials or objects (fomites) such as toys, 
handkerchiefs, soiled clothes, bedding, cooking or eating utensils, and surgical 
instruments or dressings (indirect contact); water, food, milk, biological 
products including blood, serum, plasma, tissues, or organs; or any substance 
serving as an intermediate means by which an infectious agent is transported and 
introduced into a susceptible host through a suitable portal of entry. The agent 
may or may not have multiplied or developed in or on the vehicle before 
being transmitted. 

Vector-borne: (1) Mechanical: Includes simple mechanical carriage by a 
crawling or flying insect through soiling of its feet or proboscis, or by passage of 
organisms through its gastrointestinal tract. This does not require 
multiplication or development of the organism. (2) Biological: Propagation 
(multiplication), cyclic development, or a combination of these (cyclopropagative) is 
required before the arthropod can transmit the infective form of the agent to 
man. An incubation period (extrinsic) is required following infection before 
the arthropod becomes infective. The infectious agent may be passed vertically 
to succeeding generations (transovarian transmission); transstadial 
transmission is its passage from one stage of the life cycle to another, as nymph to 
adult. Transmission may be by saliva during biting or by regurgitation or 
deposition on the skin of feces or other material capable of penetrating 
subsequently through the bite wound or through an area of trauma from scratching 
or rubbing. This is transmission by an infected nonvertebrate host and must 
be differentiated for epidemiologic purposes from simple mechanical carriage 
by a vector in the role of a vehicle. An arthropod in either role is termed a 
"vector." 

Airborne: The dissemination of microbial aerosols to a suitable portal of 
entry, usually the respiratory tract. Microbial aerosols are suspensions in the air 
of particles consisting partially or wholly of microorganisms. Particles in the 
1-5 μm range are easily drawn into the alveoli of the lungs and may be retained 
there; many are exhaled from the alveoli without deposition. They may 
remain suspended in the air for long periods of time, some retaining and others 
losing infectivity or virulence. Not considered as airborne are droplets and 
other large particles that promptly settle out (see Direct transmission, above). 

The following are airborne and their mode of transmission is direct: 

Droplet nuclei: Usually the small residues that result from evaporation of 
fluid from droplets emitted by an infected host (see above). Droplet nuclei also 
may be created purposely by a variety of atomizing devices, or accidentally as 
in microbiology laboratories or in abattoirs, rendering plants, or autopsy rooms. 
They usually remain suspended in the air for long periods of time. 

Dust: The small particles of widely varying size that may arise from soil (as, 
for example, fungus spores separated from dry soil by wind or mechanical 
agitation), clothes, bedding, or contaminated floors.¹ See also acquaintance 
network; air-borne infection; carrier; common vehicle spread; contact; 
contamination; droplet nuclei. 

¹Benenson AS (Ed.): Control of Communicable Diseases in Man, 14th ed. Washington DC: American 
Public Health Association, 1985. 

TRANSOVARIAL TRANSMISSION See VECTOR-BORNE INFECTION. 

TRANSPORT HOST See PARATENIC HOST. 

trend A long-term movement in an ordered series, e.g., a time series. An essential 
feature is that the movement, while possibly irregular in the short term, shows 
movement consistently in the same direction over a long term. The term is also 
used loosely to refer to an association which is consistent in several samples or strata 
but is not statistically significant. 

trend line That line that best fits the distribution of a set of values plotted on two 
axes. 

TRIAL See CLINICAL TRIAL. 

trohoc study A retrospective case-control study. The term, proposed by AR 
Feinstein,¹ is the inversion of "cohort"; its use is deprecated by the great majority of 
epidemiologists. 

¹Clin Pharmacol Ther 30:564-577, 1981. 

TYPE I ERROR See ERROR. 

TYPE II ERROR See ERROR. 

twin study Method of detecting genetic etiology in human disease. The basic premise
of twin studies is that monozygotic twins, being formed by the division of a single
fertilized ovum, carry identical genes, while dizygotic twins, being formed by the
fertilization of two ova by two different spermatozoa, are genetically no more similar
than two siblings born after separate pregnancies.

two-tail test A statistical significance test based on the assumption that the data are
distributed in both directions from some central value(s).

unbiassed estimator An estimator that for all sample sizes has an expected value equal
to the parameter being estimated. If an estimator tends to be unbiassed as sample 
size increases, it is referred to as asymptotically unbiassed. 
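
The distinction can be checked exactly on a toy case. The sketch below, a hypothetical illustration rather than anything taken from the dictionary, enumerates every equally likely sample drawn with replacement from a four-value population and compares two variance estimators: dividing the sum of squared deviations by n (biassed, but asymptotically unbiassed) and dividing by n - 1 (unbiassed). All names are assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import product


def expected_value(estimator, population, n):
    """Exact expected value of `estimator` over all equally likely samples
    of size n drawn with replacement from `population`."""
    samples = list(product(population, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean, as an exact fraction."""
    mean = Fraction(sum(sample), len(sample))
    return sum((Fraction(x) - mean) ** 2 for x in sample)


def var_divide_by_n(sample):
    return sum_sq_dev(sample) / len(sample)


def var_divide_by_n_minus_1(sample):
    return sum_sq_dev(sample) / (len(sample) - 1)


population = [1, 2, 3, 4]
pop_mean = Fraction(sum(population), len(population))
# The parameter being estimated: the variance of the whole population.
true_variance = sum((Fraction(x) - pop_mean) ** 2 for x in population) / len(population)

for n in (2, 3, 4):
    print(n,
          expected_value(var_divide_by_n, population, n),
          expected_value(var_divide_by_n_minus_1, population, n),
          true_variance)
```

The n - 1 version matches the population variance at every sample size, while the divide-by-n version falls short by the factor (n - 1)/n, a gap that shrinks as n grows.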

UNDERLYING CAUSE OF DEATH See DEATH CERTIFICATE. 

underreporting Failure to identify and/or count all cases, leading to reduction of the
numerator in a rate. See also error.

utility In economics, this means satisfaction derived from obtaining some quantity of
a specified article of commerce. When used in decision theory or clinical decision
analysis, the meaning is essentially the same, and can be expressed as the usefulness
or desirability of an outcome resulting from a decision.
vaccination Strictly speaking, vaccination refers to inoculation (from Latin in oculus, 
into a bud) with vaccinia virus against smallpox. Nowadays the word is broadly used
synonymously with procedures for immunization against all infectious disease. 
vaccine Immunobiological substance used for active immunization by introducing into 
the body a live modified, attenuated, or killed inactivated infectious organism or its 
toxin. The vaccine is capable of stimulating immune response by the host, who is 
thus rendered resistant to infection. The word "vaccine” was originally applied to 
the serum from a cow infected with vaccinia virus (cowpox; from Latin vacca, cow);
it is now used of all immunizing agents. 
validation The process of establishing that a method is sound.

validity This term, derived from the Latin validus, strong, has several meanings, usually
accompanied by a qualifying word or phrase.

validity, measurement An expression of the degree to which a measurement measures
what it purports to measure.

Several varieties are distinguished, including construct validity, content validity, 
and criterion validity (concurrent and predictive validity). 

Construct validity: The extent to which the measurement corresponds to theoretical
concepts (constructs) concerning the phenomenon under study. For example, if
on theoretical grounds, the phenomenon should change with age, a measurement
with construct validity would reflect such a change.

Content validity: The extent to which the measurement incorporates the domain
of the phenomenon under study. For example, a measurement of functional health
status should embrace activities of daily living, occupational, family, and social
functioning, etc.

Criterion validity: The extent to which the measurement correlates with an external
criterion of the phenomenon under study. Two aspects of criterion validity can
be distinguished:

1. Concurrent validity: The measurement and the criterion refer to the same point
in time. An example would be a visual inspection of a wound for evidence of
infection validated against bacteriological examination of a specimen taken at
the same time.

2. Predictive validity: The measurement's validity is expressed in terms of its ability
to predict the criterion. An example would be an academic aptitude test
that was validated against subsequent academic performance.

VALIDITY, study The degree to which the inferences drawn from a study, especially
generalizations extending beyond the study sample, are warranted when account is
taken of the study methods, the representativeness of the study sample, and the 
nature of the population from which it is drawn. Two varieties of study validity are 
distinguished: 

1. Internal validity: The index and comparison groups are selected and compared
in such a manner that the observed differences between them on the dependent
variables under study may, apart from sampling error, be attributed only
to the hypothesized effect under investigation.

2. External validity (generalizability): A study is externally valid or generalizable if
it can produce unbiased inferences regarding a target population (beyond the
subjects in the study). This aspect of validity is only meaningful with regard
to a specified external target population. For example, the results of a study
conducted using only white male subjects might or might not be generalizable
to all human males (the target population consisting of all human males). It is
not generalizable to females (the target population consisting of all people).
The evaluation of generalizability usually involves much more subject-matter
judgment than internal validity.

These epidemiologic definitions of the terms "internal validity" and "external validity"
do not correspond exactly to some definitions found in the sociological
literature.

variable Any quantity that varies. Any attribute, phenomenon, or event that can have 
different values. 

variable, antecedent A variable that causally precedes the association or outcome 
under study. See also explanatory variable; independent variable. 

VARIABLE, CONFOUNDING See CONFOUNDING. 

variable, control Independent variable other than the "hypothetical causal variable” 
that has a potential effect on the dependent variable and is subject to control by 
analysis. 

VARIABLE, DEPENDENT See DEPENDENT VARIABLE. 

VARIABLE, DISTORTER A confounding variable that diminishes, masks, or reverses the
association under study. 

VARIABLE, EXPERIENTIAL See INDEPENDENT VARIABLE. 

VARIABLE, INDEPENDENT See INDEPENDENT VARIABLE.

VARIABLE, INTERVENING See INTERVENING VARIABLE.

VARIABLE, MANIFESTATIONAL See DEPENDENT VARIABLE. 

VARIABLE, MODERATOR See EFFECT MODIFIER. 

VARIABLE, PASSENGER See PASSENGER VARIABLE.

variable, uncontrolled A (potentially) confounding variable that has not been brought 
under control by design or analysis. See also confounding. 

variance A measure of the variation shown by a set of observations, defined by the 
sum of the squares of deviations from the mean, divided by the number of degrees 
of freedom in the set of observations. 
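
As a worked illustration of this definition, the sketch below computes the variance of a sample. It assumes the usual convention that a sample of n observations has n - 1 degrees of freedom once the mean has been estimated from the same data; the function name `sample_variance` is an illustrative choice.

```python
def sample_variance(observations):
    """Sum of squared deviations from the mean, divided by n - 1.

    Assumes n - 1 degrees of freedom, the common convention when the
    mean is estimated from the same sample.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("variance needs at least two observations")
    mean = sum(observations) / n
    sum_of_squares = sum((x - mean) ** 2 for x in observations)
    return sum_of_squares / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5; squared deviations sum to 32
result = sample_variance(data)   # 32 / 7
```

Python's standard library offers the same calculation as `statistics.variance`.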

VARIATE (Syn: random variable) A variable that may assume any of a set of values, each
with a preassigned probability (known as its distribution).

VECTOR

1. In infectious disease epidemiology, an insect or any living carrier that transports
an infectious agent from an infected individual or its wastes to a susceptible
individual or its food or immediate surroundings. The organism may or
may not pass through a developmental cycle within the vector.

2. In statistics, an ordered set of numbers representing the values of a set of
variables.

vector-borne infection Several classes of vector-borne infections are recognized, each 
with epidemiologic features that are determined by the interaction between the
infectious agent and the human host, on the one hand, and the vector on the other. 
Therefore, environmental factors such as climatic and seasonal variations influence 
the epidemiologic pattern by virtue of their effects on the vector and its habits. 

The terms used to describe specific features of vector-borne infections are: 

Biological transmission: Transmission of the infectious agent to susceptible host by 
bite of blood-feeding (arthropod) vector as in malaria, or by other inoculation, as 
in Schistosoma infection. 

Extrinsic incubation period: Time necessary after acquisition of infection by the
(arthropod) vector for the infectious agent to multiply or develop sufficiently so that
it can be transmitted by the vector to a vertebrate host.

Hibernation: A possible mechanism by which the infected vector survives adverse 
cold weather by becoming dormant.

Inapparent infection: Response to infection without developing overt signs of illness.
If this is accompanied by viremia or bacteremia in a high proportion of infected
animals or persons, the receptor species is well suited as an epidemiologically
important host in the transmission cycle.

Mechanical transmission: Transport of the infectious agent between hosts by arthropod
vectors with contaminated mouthparts, antennae, or limbs. There is no
multiplication of the infectious agent in the vector.

Overwintering: Persistence of the infectious microorganism in the vector for extended
periods, such as the cooler winter months, during which the vector has no
opportunity to be reinfected or to infect a vertebrate host. Overwintering is an
important concept in the epidemiology of vector-borne diseases since the annual
recrudescence of viral activity after periods (winter, dry season) adverse to continual
transmission depends upon a mechanism for local survival of an infectious
microorganism or its reintroduction from outside the endemic area. To some extent,
the risk of a summertime epidemic may be determined by the relative success of
microorganism survival in the local winter reservoir. Since overwinter survival may
in turn depend upon the level of activity of the microorganism during the preceding
summer-fall, outbreaks sometimes occur for two or more successive years.

Transovarial infection (transmission): Transmission of the infectious microorganism
from the affected female arthropod to her progeny. 

vector space An area (or volume) defined by the specified dimensions of two (or 
three) vectors. 

vehicle of infection transmission The mode of transmission of an infectious agent 
from its reservoir to a susceptible host. This can be person-to-person, food, vector-borne,
etc.

Venn diagram A pictorial presentation of the extent to which two or more quantities 
or concepts are mutually inclusive and mutually exclusive. 

[Figure: Venn diagram. From Susser, 1973. Labels: hypothetical causal (independent) variable, X; control variable, Z; dependent variable, Y. A = strength of association of dependent variable with hypothetical causal variable before introduction of third, control variable (proportion of variance accounted for by causal variable). B = strength of association of dependent variable with control variable. C = overlap, in associations with dependent variable, of hypothetical causal variable and control variable.]

Virchow, Rudolf (1821-1902) Born in Pomerania, Virchow graduated in medicine
from Berlin in 1843 and rapidly established his reputation as the leading medical
scientist of his time. Modern pathology owes much to his rigorous use of hypothesis-
testing methods, illustrated in his first paper in the journal he founded, Archiv für
pathologische Anatomie, now universally known as Virchow's Archives. Virchow was
also a practicing epidemiologist, who investigated a serious epidemic of typhus in
Silesia in 1848; his recommendations for hygienic and social reform got him into
trouble with the government, but his scientific brilliance made it impossible for the
authorities not to recognize and reward him with promotions and honors. He entered
Parliament in 1862, and during the Franco-Prussian War he organized an
ambulance service. He made many contributions of fundamental importance to the
science of pathology, but deserves to be remembered as a great humanitarian as
well.

virgin population A population that has never been exposed to a particular infectious 
agent. 

VIRULENCE The degree of pathogenicity; the disease-evoking power of a microorganism
in a given host. Numerically expressed as the ratio of the number of cases of
overt infection to the total number infected, as determined by immunoassay. When
death is the only criterion of severity, this is the case-fatality rate.

vital records (Literally, "To do with living") Certificates of birth, death, marriage,
and divorce required for legal and demographic purposes.

vital statistics Systematically tabulated information concerning births, marriages, divorces,
separations, and deaths based on registrations of these vital events.



Source: https://www.industrydocuments.ucsf.edu/docs/kypxOOOO 






W, X, Y, Z 


washout phase That stage in a study, especially a therapeutic trial, when treatment is
withdrawn so that its effects disappear and the subject's characteristics return to
their baseline state.

worm count A method of surveillance of helminth infection of the gut that depends
upon counts of worms, or their cysts or ova, in quantitatively titrated samples of
feces. Other terms used to describe this form of surveillance are "egg count," "cyst
count," and "parasite count."

Wu, Lien-Teh (1879-1960) Chinese epidemiologist, responsible for controlling the plague
pandemic in Manchuria in 1910-11. Later he worked on control of sexually transmitted
diseases and other socioeconomically determined health conditions. He developed
a national quarantine service and was one of the founders of the Chinese
Medical Association, thus helping to lay the foundations for health improvements
in modern China.

XENOBIOTIC 

1. (Syn: commensal, symbiosis) Pertaining to association of two animal species,
usually insects, in the absence of a dependency relationship, as opposed to
parasitism.

2. A foreign compound that is metabolized in the body. Many pesticides and
their derivatives, some food additives and a number of other complex organic
compounds such as dioxins and PCBs, are xenobiotics.

xenodiagnosis Detection of a (human) pathogenic organism by allowing a noninfected
vector (e.g., mosquito) to consume infected material, and then examining this vector
for evidence of the pathogen.

Yates’ correction An adjustment proposed by Yates (1934) in the chi-square calculation
for a 2x2 table, which brings the distribution based on discontinuous frequencies
closer to the continuous chi-square distribution from which the published
tables for testing chi-squares are derived.
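
The entry does not spell out the adjustment. In its standard textbook form for a 2x2 table with cells a, b (first row) and c, d (second row) and total N, the corrected statistic is N(|ad - bc| - N/2)^2 / [(a+b)(c+d)(a+c)(b+d)]. The Python sketch below implements that form; the function name and the cell layout are illustrative assumptions, and clamping the corrected difference at zero is a common convention rather than part of the definition above.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    With yates=True the continuity correction subtracts N/2 from
    |ad - bc| before squaring (clamped at zero, a common convention).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts, for illustration only.
corrected = chi_square_2x2(10, 20, 30, 40)
uncorrected = chi_square_2x2(10, 20, 30, 40, yates=False)
```

The corrected value is never larger than the uncorrected one, so the correction makes the test more conservative, and it matters most when cell counts are small.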

years of potential life lost (YPLL) See potential years of life lost.

yield The number or proportion of cases of a condition accurately identified by a 
screening test. 

Youden’s index When assessing screening tests, in the uncommon case where the risk
of a false negative and that of a false positive result are assumed to be equivalent
(i.e., specificity and sensitivity assumed to be equally important), it may be possible
to compare screening tests through the Youden index based on the sum of specificity
and sensitivity:

Youden index = J = specificity + sensitivity - 1

with J ranging from zero (specificity = 0.50 and sensitivity = 0.50) to 1 (sensitivity
= 1.00, specificity = 1.00).
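
A short Python sketch of the calculation, starting from the four counts of a screening test checked against a reference diagnosis. The function and argument names are illustrative assumptions, and the counts in the example are invented.

```python
def youden_index(true_pos, false_neg, true_neg, false_pos):
    """Youden's J = sensitivity + specificity - 1.

    sensitivity = true positives / all persons with the condition
    specificity = true negatives / all persons without the condition
    """
    with_condition = true_pos + false_neg
    without_condition = true_neg + false_pos
    if with_condition == 0 or without_condition == 0:
        raise ValueError("need at least one person with and one without the condition")
    sensitivity = true_pos / with_condition
    specificity = true_neg / without_condition
    return sensitivity + specificity - 1


# Hypothetical test: sensitivity 0.90, specificity 0.80.
j = youden_index(true_pos=90, false_neg=10, true_neg=80, false_pos=20)
```

A test that performs no better than chance (sensitivity and specificity both 0.50) gives J = 0, and a perfect test gives J = 1.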



zero-time shift This concerns the selection of a starting point for the measurement
of survival following the detection of disease. It is a jargon term, denoting the
movement "backward" (toward the starting point of a disease) of time between onset
and detection, that may accompany use of a screening procedure.

zoonosis An infection or infectious disease transmissible under natural conditions from
vertebrate animals to man. Examples include rabies and plague. May be enzootic
or epizootic.

A Note to Readers


THE rules of statistics are the rules of good thinking, codified.
They apply to any kind of reporting in which numbers,
stated or implied, are involved: political reporting, science reporting,
business, economics, sports, or whatever.

This guide is an attempt to explain the role, logic, and 
language of statistics, so we reporters can ask better questions 
about the many alleged facts or findings that rest, or should rest, 
on some credible numbers. Because this manual began as a 
project of the Harvard School of Public Health, the reporting of 
health and the environment is the major example. But the principles
and many of the suggested “questions for reporters” can be
used by inquiring reporters in any field. They can help you read 
a scientific report or listen to the conflicting claims of politicians, 
environmentalists, physicians, scientists, or almost anyone and 
weigh and explain them. And the final chapter specifically 
shows how these principles apply in all areas. 

Victor Cohn

Foreword


REPORTERS play an essential role in communicating
science to the public. In common with scientists, they desire
accuracy. Although health and medicine provide many exciting
stories, the biostatistics that scientists must use in their studies
presents special problems for reporters. It gives uncommon and
misleading meanings to common words like “significant,” “consistent,”
and “power.” Mathematical statistics often produces results
that are disturbingly counterintuitive, at least at first, to
laymen and scientists alike. In vital statistics and epidemiology,
definitions often seem arbitrary, and slight changes make considerable
differences in the findings.

Science writers often take short courses in special topics 
such as biostatistics. I have taught in some of these courses and 
have been impressed by the seriousness of the participants. Nevertheless,
they need some of this material in an accessible and
permanent form. 

Victor Cohn of the Washington Post has prepared this manual
to help all reporters cut through these statistical tangles. He
wants to give them a guide to the ways that statistics can clarify 
facts or mystify the reader. 

Cohn’s book grew out of the Media Project of our Health 
Science Policy Working Group of the Division of Health Policy 
Research and Education at Harvard University. I am pleased
that faculty members of the Harvard School of Public Health
have been able to help him produce this book as a visiting fellow

MY main mentor and guide in the preparation of this book
has been Dr. Frederick Mosteller, Roger I. Lee professor emeritus
of mathematical statistics and former chairman of the departments
of Biostatistics and Health Policy and Management,
Harvard School of Public Health. He gave so fully of his time,
energy, and knowledge that he should be listed as coauthor but
for the fact that I sometimes used a journalist’s freewheeling
approach rather than a statistician’s rigor. This makes any misstatements
mine.

The project was supported by the Russell Sage Foundation
and by the Council for the Advancement of Science Writing,
which pointed the way in holding seminars on statistics for
journalists, including the first of its kind in 1964.

I did much of the work as a visiting fellow at the Harvard 
School of Public Health, where Dr. Jay Winsten, director of the 
Center for Health Communication, was another indispensable 
guide, and Drs. John Bailar III, Nan Laird, Philip Lavin, 
Thomas A. Louis, and Marvin Zelen were valuable helpers. As 
were Drs. Gary D. Friedman and Thomas M. Vogt of the 
Kaiser organizations, Michael Greenberg of Rutgers University, 
and Peter Montague of Princeton University (on all of whose 
writings I leaned); Lewis Cope of the Minneapolis Star Tribune; 
Cass Peterson of the Washington Post; and my daughter, Deborah 
Runkie, no mean statistician. 

I also owe thanks to Harvard’s Drs. Peter Braun, Harvey

WE journalists like to think we deal mainly in facts and
ideas, but much of what we report is based on numbers. 

Politics comes down to votes. Budgets and dollars dominate 
government. The economy, business, employment, sports—all 
demand numbers. 

The environment, pollutants, toxic chemicals. Again, we 
see counts and measurements and, most likely, widely varying 
estimates, some careful, some questionably high or low. An 
environmentalist says a nuclear power plant or toxic waste 
dump will cause so many cases of cancer. An industry spokesman
denies it. What are their numbers? Where did they get
them? How valid are they? 

A doctor reports a promising, even exciting new treatment. 
Is the claim justified or based on a biased or unrepresentative 
sample? Or too few patients to justify any claim? Science, medicine,
technology, the weather, intelligence: all are statistical.

FACTS AND FIGURES: WE CAN DO BETTER


Science is observation, experimentation, measurement, and all
these involve numbers, whether we reporters pay attention to 
them or not. 

Statistics are used or misused even by people who tell us, “I
don’t believe in statistics,” then claim that all of us or most people
or many do such and such. The question for reporters is, how
should we not merely repeat such numbers, stated or implied,
but also interpret them to deliver the best possible picture of
reality?

We can be better reporters if we understand how the best 
statisticians —the best figurers —figure. And if we learn a few 
questions to help us separate the wheat from the chaff. 

I do not say that telling the truth (describing reality) will
then become easy, for we are constantly bombarded with sweeping
claims in convincing wrappings, and the disputed subjects
are endless. Medical and surgical treatments, radiation, pesticides,
nuclear power, the probability of environmental disasters,
the side effects of medicines: almost nothing seems settled.

Like it or not, we must wade in. Whether we will it or not,
we have in effect become part of the regulatory apparatus. Dr.
Peter Montague of Princeton University tells us, “The environmental
and toxic situation is so complex, we can’t possibly have
enough officials to monitor it. Reporters help officials decide
where to focus their activity.”

“Journalists opened up” the Love Canal toxic waste issue by
“independent investigation,” according to Cornell University’s
Dr. Dorothy Nelkin. “The extensive press coverage contributed
to investigations that eventually forced the re-staffing of the Environmental
Protection Agency and the creation of a national
toxic waste disposal program.” 1

That very coverage, however, may also have stampeded
public officials into hasty, ill-conceived studies that left unanswered
the crucial question: Did the Love Canal wastes actually
cause birth defects and other physical problems? 2 The
very way we report a medical or environmental controversy can 
affect the outcome. If we ignore a bad situation, the public may
suffer. If we write “danger,” the public may quake. If we write
“no danger,” the public may be falsely reassured. If we paint an 
experimental medical treatment too brightly, the public is given 
false hope. 

It is not just what we write, it is what we emphasize. A
National Cancer Institute survey indicated that many persons
refuse to consider healthy changes in life-style because they
think “carcinogens are everywhere in the environment.” Such
persons probably have read or heard again and again that most
cancers are environmentally related, although, in the opinion of
most informed scientists, most fatal “environmental” cancers are
related mainly to individual behavior, outstandingly smoking,
and very possibly diet. By various estimates, perhaps 5 to 15
percent of all cancers are related to exposures to man-made
carcinogens: chemicals we have inserted into the workplace,
foods, air, and water. 3

When it comes to such emotionally charged and complex
issues, or when it simply comes to running for page one or
making the six o’clock news, the best among us sometimes overstate
or understate. Philip Meyer, veteran reporter and author
of Precision Journalism, writes, “Journalists who misinterpret
statistical data usually tend to err in the direction of overinterpretation.
. . . The reason for this professional bias is self-evident;
you usually can’t write a snappy lead upholding [the
negative]: A story purporting to show that apple pie makes you
sterile is more interesting than one that says there is no evidence
that apple pie changes your life.” 4

We also work fast, sometimes too fast, with severe limits on
the space or time we may fill. We find it hard to tell editors or
news directors, “I haven’t had enough time. I don’t have the
story yet.” Even a long-term project or special may be hurriedly
done. In a newsroom “long-term” may mean a few weeks. A
major southern newspaper had to print a long, front-page retraction
after a series of front-page stories alleged that people
who worked at or lived near a plutonium plant suffered in excess
numbers from a blood disease. “Our reporters obviously had
confused statistics and scientific data,” the editor admitted. “We
did not ask enough questions.” 5

We tend to oversimplify. We may report, “A study showed
that black is white” or “So-and-so announced that ...” when a
study merely suggested that there was some evidence that such
might be the case. We may slight or omit the fact that a scientist
calls a result “preliminary.” As scientific unsophisticates, we may
confuse a study that merely suggests a hypothesis that should be
investigated (very frequently the case) with a study that
presents strong and conclusive evidence.

We often omit essential perspective, context, or background.
Dr. Thomas Vogt of the Kaiser Permanente Center for
Health Research tells of seeing the headline “Heart Attacks
From Lack of Vitamin C” and then, two months later, “People Who
Take Vitamin C Increase Their Chances of a Heart Attack.” 6
Both stories were based on limited, and far from conclusive,
animal studies.

Scientists who do poor studies or overstate their results
deserve part of the blame. But bad science is no excuse for bad
journalism. We tend to rely most on “authorities” who are either
most quotable or quickly available or both, and they often tend
to be those who get most carried away with their sketchy and
unconfirmed but “exciting” data, or have big axes to grind,
however lofty their motives. The cautious, unbiased scientist
who says, “Our results are inconclusive” or “We don’t have
enough data yet to make any strong statement” or “I don’t know”
tends to be omitted or buried someplace down in the story.

We are influenced too by intense and growing competition
to tell the story first and tell it most dramatically. I was once
asked by a Harvard researcher, “Does competition affect the way
you present a story?” I thought and had to answer, “We have to
almost overstate. We have to come as close as we can within the
boundaries of truth to a dramatic, compelling statement. A
weak statement will go no place.” Another reporter said, “The
fact is, you are going for the strong [lead and story]. And, while

The Certainty of Uncertainty


Too much of the science reporting in the press [blurs] what we’re sure of and
what we’re not very sure of and what is inconclusive. The notion of tentativeness
tends to drop out of much reporting.

— Dr Harvey Brooks 



The only trouble with a sure thing is the uncertainty.

— Author unknown 


THE first thing to understand about science is that it is 
almost always uncertain. A scientist, seeking to explain or understand
something, be it the behavior of an atom or the effect
of the toxic chemicals at a Love Canal, usually proposes a
hypothesis, then seeks to test it by experiment or observation. If 
the evidence is strongly supportive, the hypothesis may then 
become a theory or at some point even a law, like the law of 
gravity. 

A theory may be so solid that it is generally accepted.
Example: the theory that cigarette smoking causes lung cancer,
for which almost any reasonable person would say the case has
been proved, for all practical purposes. The phrase “for all practical
purposes” is important, for scientists, being practical people,
must often speak at two levels: the strictly scientific level
and the level of ordinary reason that we require for daily guidance.

Example: In June 1985, 16 forensic experts examined the
bones that were supposedly those of the “Angel of Death,” Dr.
Josef Mengele. Dr. Lowell Levine, delegated by the Department
of Justice, then said, “The skeleton is that of Josef
Mengele within a reasonable scientific certainty,” and Dr. Marcos
Segre of the University of Sao Paulo explained, “We deal
with the law of probabilities. We are scientists and not magicians.”
Pushed by reporters’ questions (after all, this was an
important matter, and what should the public believe?), several
of the pathologists said they had “absolutely no doubt” of their
findings. 1 (Later evidence made the case even stronger.)

But all any scientist can scientifically say (say with certainty
in almost any such case) is, there is a very strong probability
that such and such is true.

Widely believed theories or conclusions are often proved
wholly or partly wrong. “When it comes to almost anything we
say,” reports Dr. Arnold Relman, editor of the New England
Journal of Medicine, “you, the reporter, must realize, and must
help the public understand, that we are almost always dealing
with an element of uncertainty. Most scientific information is of
a probable nature, and we are only talking about probabilities,
not certainty. What we are concluding is the best we can do, our
best opinion at the moment, and things may be updated in the
future.”

Example: Until 1980 the American Cancer Society recommended
that women have an annual Pap smear to detect cervical
cancer. The recommendation was then changed to every
three years for many women, after two initial examinations.
Statistics had shown that this would be equally effective. 2 The
matter is still controversial, and the recommendation has been
changed again in the light of new knowledge.

Scientists are often wrong. In science this is not necessarily 
a failing. When new evidence disproves an old theory, or occasionally
shows that some little believed, even kooky notion is
right, the scientific method is doing what it should. It is work¬ 
ing. 

The public, and even some reporters and especially editors,
have a hard time understanding these sometimes drastic revisions.
We all hear the question, Why do they say one thing
today and another thing tomorrow? I was once on a radio talk
show discussing unsettled medical controversies when a testy
listener phoned in to exclaim, “‘They say’ is a damned liar!”

“They,” of course, may be different theys who arrive at different
conclusions about inconclusive evidence in a thousand
areas: the role of fats and cholesterol in the diet, the effects of
low-level radioactivity, the cause of the extinction of dinosaurs.

Why so much uncertainty? Science is always a continuing
story. Nature is complex, and almost all methods of observation
and experiment are imperfect. “There are flaws in all studies,”
says Harvard’s Dr. Marvin Zelen. 3 There may be weaknesses,
often unavoidable ones, in the way a study is designed or conducted.
Observers are subject to human bias and error. Subjects
fluctuate. Measurements fluctuate.

Many studies are thus inconclusive, and virtually no single
study proves anything. “Fundamentally,” writes Dr. Thomas
Vogt, “all scientific investigations require confirmation, and until
it is forthcoming all results, no matter how sound they may
seem, are preliminary.” 4

Medicine, in particular, is full of disagreement and controversy.
“No clinical trial is ever perfect,” Harvard’s Dr. John
Bailar observes. Unlike new drugs, medical treatments and tests
and surgical operations need not even be subjected to experimental
studies before being applied. “Most treatments escape
and will continue to escape rigorous evaluation,” Bailar says.5

The reasons are many: lack of funds to mount enough
trials; lack of enough patients at any one center to mount a
meaningful trial; the expense and difficulty of doing multicenter
trials; the swift evolution and obsolescence of medical techniques;
the fact that, with the best of intentions, medical data—histories,
physical examinations, interpretations of tests, descriptions
of symptoms and diseases—are notoriously inexact and
vary from physician to physician; and the serious ethical obstacles
to trying a new procedure when an old one is doing some
good, or to experimenting on children, pregnant women, or the
mentally ill.

While all studies have flaws, some have more flaws than
others. Study after study has found that many articles in the
most prestigious medical journals are replete with shaky statistics
and lack of any explanation of such crucial matters as patients’
complications and the number of patients lost to follow-up.
Papers presented at medical meetings, many of them widely
reported by the media, are even less reliable. Many papers are
mere progress reports on incomplete studies. Some state tentative
results that later collapse. Some are given to draw comment
or criticism or get others interested in a provocative but still
uncertain finding.6

The upshot, according to Dr. Gary Friedman of the Kaiser
organization’s Permanente Medical Group: “Much of health
care is based on tenuous evidence and incomplete knowledge.
. . . Seemingly authoritative statements and accepted medical
doctrines, perpetuated through textbook and lectures, often turn
out to be supported by the most meager of evidence, if any can
be found.”7

In general, possible risks tend to be underestimated and
possible benefits overestimated. For decades surgeons swore
that only a radical mastectomy was the treatment for breast
cancer. Only recently were clinical trials mounted to show that
less drastic treatments seem equally effective. Prefrontal lobotomy,
overstrict bed rest, drugs by the carload—medical history
is rich in treatments that were given for years without question
or statistically rigorous study, only to be proved wrong and
discarded.

Occasionally, unscrupulous investigators falsify their results.
More often, they may wittingly or unwittingly play down
data that contradict their theories, or they may search out statistical
methods that give them the results they want. Before
ascribing fraud, says Harvard’s Dr. Frederick Mosteller, “keep
in mind the old saying that most institutions have enough incompetence
to explain almost any results.”8

So some uncertainty almost always prevails. But uncertainty
need not stand in the way of good sense. To live—to
survive on this globe, to maintain our health, to set public
policy, to govern ourselves—we almost always must act on the
basis of incomplete or uncertain information. There is a way we
can do so.
Somehow the wondrous promise of the earth is that there are things beautiful in
it, things wondrous and alluring, and by virtue of your trade, you want to
understand them.

— Mitchell Feigenbaum
Cornell University physicist and mathematician

The great tragedy of Science—the slaying of a beautiful hypothesis by an ugly
fact.

— Thomas Henry Huxley


To reporters, the world is full of true believers, peddling
their “truths.” The sincerely misguided and the outright fakers
are often highly convincing, also newsy. How can we tell the
facts, or the probable facts, from the chaff?

We can borrow from science. We can try to judge all possible
claims of fact by the same methods and rules of evidence that
scientists use to derive some reasonable guidance in scores of
unsettled issues.

As a start, we can ask these questions: 

How do you know? 

Have the claims been subjected to any studies or experiments? 

Were the studies acceptable ones, by general agreement? For example:
Were they without any substantial bias?

Have results been fairly consistent from study to study?

Have the findings resulted in a consensus among others in the same
field? Do at least the majority of informed persons agree? Or should we
withhold judgment until there is more evidence?

Always: Are the conclusions backed by believable statistical evidence?
And what is the degree of certainty or uncertainty? How sure can you
be?


Obviously, much of statistics involves attitude or policy 
rather than numbers. And much, at least much of the statistics 
that reporters can most readily apply, is good sense. 

There are many definitions of statistics as a tool. A few 
useful ones: The science and art of gathering, analyzing, and 
interpreting data; a means of deciding whether an effect is real; 
a way of extracting information from a mass of raw data; a set 
of mathematical processes derived from probability theory. 

Statistics can be manipulated by charlatans, self-deluders,
and inexpert statisticians. Deciding on the truth of a matter can
be difficult for the best statisticians, and sometimes no decision is
possible. Uncertainty will ever rule in some situations and lurk
in almost all.

There are rare situations in which no statistics are needed.
“Edison had it easy,” says Dr. Robert Hooke, a statistician and
author. “It doesn’t take statistics to see that a light has come on.”
It did not take statistics to tell 19th-century physicians that Morton’s
ether anesthesia permitted painless surgery or to tell 20th-century
physicians that the first antibiotics cured infections that
until then had been highly fatal.

Overwhelmingly, however, the use of statistics, based on
probability, is called the soundest method of decision making, 
and the use of large numbers of cases, statistically analyzed, is 
called the only means for determining the unknown cause of 
many events. Birth control pills were tested on several hundred 
women, yet the pills had to be used for several years by millions 
before it became unequivocally clear that some women would 
develop heart attacks or strokes. The pills had to be used for 
some years more before it became clear that the greatest risk 
was to women who smoked and women over 35. 

The best statisticians, let alone practitioners on the firing
line (for example, physicians), often have trouble deciding when
a study is adequate or meaningful. Most of us cannot become
statisticians, but we can at least learn that there are studies and
studies, and the unadorned claim “We made a study” or “We did
an experiment” may not mean much. We can learn to ask more
pointed questions if we understand some basic concepts and
other facts about scientific studies.

These are some bedrock statistical concepts: 

• Probability 

• “Power” and numbers

• Bias and confounders 

• Variability 


Probability 


Scientists cope with uncertainty by measuring probabilities.
Since all experimental results and all events can be influenced
by chance and almost nothing is 100 percent certain in science
and medicine and life, probabilities sensibly describe what has
happened and should happen in the future under similar conditions.
Aristotle said, “The probable is what usually happens,” but
he might have added that the improbable happens more often
than most of us realize.

The accepted numerical expression of probability in evaluating
scientific and medical studies is the P (or probability) value.
The P value is one of the most important figures a reporter
should look for. It is determined by a statistical formula that
takes into account the numbers of subjects or events being compared
in order to answer the question, Could a difference or
result this great or greater have occurred by chance alone? By more
precise definition, the P value expresses the probability that an
observed relationship or effect or result could have seemed to
occur by chance if there had actually been no real effect. A low P value
means a low probability that this happened, that a medical
treatment, for example, might have been declared beneficial
when in truth it was not.

Here is why the P value is used to evaluate results. A
scientific investigator first forms a hypothesis. Then he or she
commonly sets out to try to disprove it by what is called the null
hypothesis: that there is no effect, that nothing will happen. To
back the original hypothesis, the results must reject the null hypothesis.
The P value, then, is expressed either as an exact
number or as <.05, say, or >.05, meaning “less than” or
“greater than” a 5 percent probability that nothing has happened,
that the observed result could have happened just by
chance—or, to use a more elegant statistician’s phrase, by random
variation.

• By convention, a P value of .05 or less, meaning there are
only 5 or fewer chances in 100 that the result could have happened
by chance, is most often regarded as low. This value is
usually called statistically significant (though sometimes other values
are used). The unadorned term “statistically significant” usually
implies that P is .05 or less.

• A higher P value, one greater than .05, is usually seen as not
statistically significant. The higher the value, the more likely the
result is due to chance.

In common language, a low chance of chance alone calling
the shots replaces the “it’s certain” or “close to certain” of ordinary
logic. A strong chance that chance could have ruled
replaces “it can’t be” or “almost certainly can’t be.”

Why the number .05 or less? Partly for standardization.
People have agreed that this is a good cutoff point for most
purposes. And partly out of old friend common sense. Frederick
Mosteller tells us that if you toss a coin repeatedly in a college
class and after each toss ask the class if there is anything suspicious
going on, “hands suddenly go up all over the room” after
the fifth head or tail in a row. There happens to be only 1
chance in 16—.0625, not far from .05, or 5 chances in 100—that
five heads or tails in a row will show up in five tosses, “so
there is some empirical evidence that the rarity of events in the
neighborhood of .05 begins to set people’s teeth on edge.”1
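
The arithmetic behind Mosteller's coin-toss figure is easy to check. What follows is only a minimal sketch in Python, assuming a fair coin; the function name is invented here for illustration and is not anything from Mosteller.

```python
from fractions import Fraction

def prob_all_same(tosses: int) -> Fraction:
    """Chance that every toss of a fair coin lands the same way."""
    one_face = Fraction(1, 2) ** tosses   # e.g. all heads
    return 2 * one_face                   # all heads OR all tails

p = prob_all_same(5)
print(p, float(p))   # 1/16 0.0625
```

Using exact fractions rather than decimals keeps the "1 chance in 16" visible in the answer.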

Another common way of reporting probability is to calculate
a confidence level, as well as a confidence interval (or confidence
limits or range). This is what happens when a political pollster
reports that candidate X would now get 50 percent of the vote
and thereby lead candidate Y by 3 percentage points, “with a 3-percentage-point
margin of error plus or minus and a 95 percent
confidence level.” In other words, Mr. or Ms. Pollster is 95
percent confident that X’s share of the vote would be someplace
between 53 and 47 percent. Similarly, candidate Y’s share might
be 3 percentage points greater (or less) than the figure predicted.
In a close election, that margin of error could obviously turn a
predicted defeat into victory. And that sometimes happens.

An important point in looking at the results of political polls
(and any other statements of confidence): In the reports we
read, the plus or minus 3 (or whatever) percentage points is
often omitted, and the pollster merely mentions a “3-point
margin of error.” This means there is actually a 6-point range
within which the truth probably lurks.

The more people who are questioned in a political poll or
the larger the number of subjects in a medical study, the greater
the chance of a high confidence level and a narrow, and therefore
more reassuring, confidence interval.
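
How quickly the interval narrows can be sketched with the usual textbook approximation for a proportion. This is an illustration only: it assumes a simple random sample and the normal approximation (z = 1.96 for a 95 percent level), and the function name and sample sizes are made up for the example. Real polls use more elaborate designs.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a share p seen in a sample of n."""
    return z * math.sqrt(p * (1 - p) / n)

# A candidate at 50 percent, polled with samples of different sizes
for n in (100, 400, 1067, 4000):
    points = 100 * margin_of_error(0.5, n)
    print(f"{n:>5} respondents: plus or minus {points:.1f} points")
```

Under these assumptions it takes roughly a thousand respondents to reach the familiar 3-point margin, and quadrupling the sample only halves the margin.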

No matter how reassuring they sound, P values and confidence
statements cannot be taken as gospel, for .05 is not a
guarantee, just a number. There are several important reasons
for this.

• All that P values measure is the probability that the results 
might have been produced by some sneaky random process. In 
20 results where only chance is at work, 1, on the average, will 
have a reassuring-sounding but misleading P value of < .05. 
One, in short, may be a false positive. 

Dr. Marvin Zelen points out that there may be 6,000 to
10,000 clinical (medical) trials of cancer treatment under way
today, and if the conventional value of .05 is adopted as the
upper permissible limit for false positives, then every 100 studies
with no actual benefit may, on average, produce 5 false-positive
results. Hence, we may expect 50 false-positive results, on
average, for every 1,000 trials with no beneficial effects! Zelen in
fact has said, “We may now have reached an impasse in cancer
chemotherapy in which there are large numbers of false-positive
therapies in the clinic,”3 leading physicians down many false
paths.
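
Zelen's count of false positives is simple multiplication, and a small simulation shows the same thing happening by chance alone. The sketch below is illustrative, not Zelen's own calculation: the function names are invented, and the simulation assumes that when a treatment truly does nothing, a trial's P value is equally likely to fall anywhere between 0 and 1.

```python
import random

def expected_false_positives(null_trials: int, alpha: float = 0.05) -> float:
    """Average number of 'significant' results among trials with no real effect."""
    return null_trials * alpha

def simulate_false_positives(null_trials: int, alpha: float = 0.05, seed: int = 0) -> int:
    """Count simulated no-effect trials whose P value happens to fall below alpha."""
    rng = random.Random(seed)
    # Assumption: under the null hypothesis the P value is uniform on [0, 1)
    return sum(rng.random() < alpha for _ in range(null_trials))

print(expected_false_positives(100))    # about 5 per 100 no-effect trials
print(expected_false_positives(1000))   # about 50 per 1,000
print(simulate_false_positives(1000))   # one simulated run; varies with the seed
```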

Amazingly, most false positives probably remain undetected.
Scientists do not profit much professionally by reporting
negative results. Journal editors are not keen on publishing
them. Nor are scientists keen on doing costly and time-consuming
studies that merely confirm someone else’s work, so “confirmatory
studies are rare,” Zelen reports.

• Statistical significance alone does not mean there is a
cause and effect. Correlation or association is not causation. Remember
the rooster who thought his crowing made the sun rise?
Unless an association is so powerful and so constantly repeated
that the case is overwhelming, association is only a clue, meaning
more study or confirmation is needed.

To statisticians, incidentally, there is this important difference
between correlation and association: Association means
there is at least a possible relation between two variables. A
correlation is a measure of the association.

• If the number of subjects is too small, an unimpressive P
value may simply mean that there were too few subjects to
detect something that might have shown an effect in more subjects.
Highly “significant” P values can sometimes adorn negligible
differences in large samples.

• An impressive P value might also be explained by some 
other variable or variables — other conditions or associations — 
not taken into account. 

• Statistical significance does not mean biological, clinical—that
is, medical—or practical significance, though inexperienced
reporters sometimes see or hear the word “significant”
and jump to that conclusion, even reporting that the scientists
called their study “significant.” Example: A tiny difference between
two large groups in mean hemoglobin concentration, or
red blood count (say, 0.1 g/100 mL, or a tenth of a gram per
100 milliliters) may be statistically significant yet medically
meaningless.4

• Eager scientists can consciously or unconsciously manipulate
the P value by failing to adjust for other factors, by choosing
to compare different end points in a study (say, condition on
leaving the hospital rather than length of survival), or by choosing
the way the P value is calculated or reported.
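
The hemoglobin example among these cautions can be made concrete. The sketch below uses a plain two-group z test; the spread of 1.2 g/100 mL and the group sizes are assumptions picked for illustration, not figures from any study, and the function name is invented.

```python
import math

def two_sided_p(diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided P value for a difference in means between two equal groups
    (z test, assuming the same known standard deviation in both groups)."""
    se = sd * math.sqrt(2 / n_per_group)       # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))

# Same 0.1 g/100 mL difference, assumed spread of 1.2 g/100 mL
print(two_sided_p(0.1, 1.2, 20))        # small groups: nowhere near .05
print(two_sided_p(0.1, 1.2, 10_000))    # huge groups: far below .05
```

The difference is identical in both lines. Only the number of subjects changed, which is why a P value by itself says nothing about medical importance.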

There are several mathematical paths to a P value, such as
the chi-square (χ²), F, r, and paired t tests. All may be legitimate.
But be warned: Dr. David Salsburg of Pfizer, Inc., has
written in the American Statistician of the unscrupulous practitioner
who “engages in a ritual known as ‘hunting for P values’”
and finds ways to modify the original data to “produce a rich
collection of small P values” even if those that result from simply
comparing two treatments “never reach the magical .05.”5

“If you look hard enough through your data,” contributes
an investigator at a major medical center, “if you do enough
subset analyses, if you go through 20 subsets, you can find
one”—say, “the effect of chemotherapy on premenopausal
women with two to five lymph nodes”—“with a P value less than
.05. And people do this.”
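
The investigator's point about 20 subsets can be put in numbers. This is a rough sketch that assumes the 20 analyses are independent and that there is no real effect anywhere; real subsets overlap, so the true figure would differ somewhat. The function name is made up for the example.

```python
def prob_at_least_one_hit(tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of several no-effect tests comes out below alpha."""
    return 1 - (1 - alpha) ** tests

print(round(prob_at_least_one_hit(1), 2))    # 0.05
print(round(prob_at_least_one_hit(20), 2))   # 0.64
```

Under those assumptions, a search through 20 subsets turns up at least one "significant" finding nearly two times in three, even when nothing is going on.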

“Statistical tests provide a basis for probability statements,”
writes Dr. John Bailar, “only when the hypothesis is fully developed
before the data are examined. . . . If even the briefest
glance at a study’s results moves the investigator to consider a
hypothesis not formulated before the study was started, that
glance destroys the probability value of the evidence at hand.”
(At the same time, Bailar adds, “review of data for unexpected
clues . . . can be an immensely fruitful source of ideas” for new
hypotheses “that can be tested in the correct way.” And occasionally
“findings may be so striking that independent confirmation
. . . is superfluous.”)6

A rather sophisticated—and possibly touchy—line of questioning
that some reporters might want to try if they’re skeptical:
How did you arrive at your P value? Did you use the test planned in
advance in your protocol or study design, or did you apply several tests, then
report the best-sounding one?

And you may think of other questions. 

The laws of probability also teach us to expect some unusual,
even impossible-sounding events.

We’ve all taken a trip to New York or London or someplace
and bumped into someone from home. The chance of that? I
don’t know, but if you and I tossed for a drink every day after
work, the chance that I would ever win 10 times in a row is 1 in
1,024. Yet I would probably do so sometime in a four- or five-year
period. What I like to call the Law of Unusual Events—statisticians
call it the Law of Small Probabilities—tells us that a
few people with apparently fatal illnesses will inexplicably recover,
there will be some amazing clusters of cases of cancer or
birth defects that will have no common cause, and I may once
in a great while bump into a friend far from home.
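
The 1-in-1,024 figure applies to any one stretch of 10 tosses; the chance of a 10-win streak turning up somewhere in years of daily tosses is a different and much larger number. The sketch below works it out step by step. It assumes a fair coin and about 250 working days a year, both assumptions made here for illustration, and the answer shifts with the number of tosses assumed.

```python
def prob_run_of_wins(tosses: int, run: int = 10, p_win: float = 0.5) -> float:
    """Probability of at least one streak of `run` straight wins in `tosses` tosses."""
    # streak[k]: probability of currently holding exactly k straight wins,
    # with no full streak completed so far
    streak = [0.0] * run
    streak[0] = 1.0
    done = 0.0                                   # probability a full streak has occurred
    for _ in range(tosses):
        nxt = [0.0] * run
        nxt[0] = (1 - p_win) * sum(streak)       # a loss resets the count
        for k in range(run - 1):
            nxt[k + 1] = p_win * streak[k]       # a win extends the count
        done += p_win * streak[run - 1]          # the win that completes the streak
        streak = nxt
    return done

print(prob_run_of_wins(10))                      # exactly 1/1024
for years in (4, 5):
    print(years, "years:", round(prob_run_of_wins(250 * years), 2))
```

Whatever the exact figure for a given number of tosses, it is hundreds of times larger than 1 in 1,024, which is the point of the Law of Unusual Events.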

In a large enough population such coincidences are not
unusual. They are the rule. They produce striking anecdotes
and often striking news stories. In the medical world they produce
unreliable, though often cited, testimonial or anecdotal
evidence. “The world is large,” Vogt notes, “and one can find a
large number of people to whom the most bizarre events have
occurred. They all have personal explanations. The vast majority
are wrong.”7

“We [reporters] are overly susceptible to anecdotal evidence,”
Philip Meyer writes. “Anecdotes make good reading,
and we are right to use them. . . . But we often forget to remind
our readers—and ourselves—of the folly of generalizing
from a few interesting cases. . . . The statistic is hard to remember.
The success stories are not.”8

A statistic to ask about is the denominator—the number of
people or, a statistician would say, the population or domain—in
whom such an event might happen. Zelen cites this example:
The chance of any youngster between ages five and nine developing
leukemia is 3 in 100,000 per year. In a school with 1,000
children of this age group, we would expect only 3 cases in 100
years. But in this nation with thousands of schools, we would
occasionally—such is chance—find schools with 3 or more cases
in a single year. “Then one is faced with the problem of interpretation,”
Zelen says. “Is this one of those rare events that is surely
going to be observed? Or is it due to some causal factor?”

A reporter in this instance might ask a statistician at the
National Cancer Institute or a medical center, What is the
chance of such an event in such a population? How many
similar unusual events are probably never reported?
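
A statistician answering that question might start with something like the sketch below. It is an illustration under stated assumptions, not Zelen's calculation: it takes the rate of 3 per 100,000 per year from the example, assumes a school of 1,000 children in the age group (the size at which that rate works out to 3 cases in 100 years), treats cases as independent rare events (a Poisson model), and uses a purely hypothetical 50,000 schools.

```python
import math

def expected_cases(children: int, annual_rate: float, years: int) -> float:
    """Average number of cases expected in a group over a span of years."""
    return children * annual_rate * years

def prob_at_least(k: int, mean: float) -> float:
    """Poisson probability of k or more cases when `mean` cases are expected."""
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1 - below

RATE = 3 / 100_000          # cases per child per year, from the example
SCHOOL = 1_000              # assumed number of children aged five to nine
SCHOOLS = 50_000            # hypothetical number of such schools nationwide

print(expected_cases(SCHOOL, RATE, 100))             # cases per school per century
one_school = prob_at_least(3, expected_cases(SCHOOL, RATE, 1))
print(one_school)                                    # 3+ cases in one school, one year
print(SCHOOLS * one_school)                          # schools per year expected to see that
```

For any single school the chance is minute. Multiplied across a whole nation of schools and many years, an occasional cluster with no common cause becomes something to expect.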

“Power” and Numbers 

This gets us to another statistical concept: power. Statistically,
“power” means the probability of finding something if it’s
there. Example: Given that there is a true effect, say a difference
between two medical treatments or an increase in cancer caused
by a toxin in a group of workers, how likely are we to find it?

Sample size confers power. Statisticians say, “Funny things
can happen in small samples without meaning very much” . . .
“There is no probability until the sample size is there” . . .
“Large numbers confer power” . . . “Large numbers at least
make us sit up and take notice.”*

All this concern about sample size can also be expressed as
the law of large numbers, which says that as the number of cases
increases, the probable truth of a conclusion or forecast increases.
The validity (truth or accuracy) and reliability (reproducibility)
of the statistics begin to converge on the truth.

We already learned this when we talked about probability. 


But bv thinkin 
both sample sb 
too affects the p 
if the number < 
shift from succc 
cally decrease ii 
If six padt 
rate, the shift 
success rate to 
any case that t 
valid or accur. 

*There is another, unrelated use of the word “power.” Scientists commonly speak of
increasing or “raising” some quantity by a power of 2 or 3 or 100 or whatever. “Power”
here means the product you get when you multiply a number by itself one or more
times. Thus, in 2 × 2 = 4, 4 is the second power of 2, or to put it another way, there
are two 2’s in your equation. This is commonly written 2² and known as 2 to the second
power or just 2 to the second. In 2 × 2 × 2 = 8, 2 has been raised to the third power.
When you think about 2 raised to a large power, you see the need for the shorthand.
not have reliability until confirmed by careful studies in larger
samples. The larger the sample, and assuming there have been 
no fatal biases or other flaws, the more confidence a statistician 
would have in the result. 



One canny science reporter, Lewis Cope, says, 

I have my own “rule of two." If someone makes some numerical 
claim, I look at the numbers, then see how much I might change the 
finding by adding or subtracting two from any of the figures. For 
example, someone says there are five cases of cancer in a community. 
Would it seem meaningful if there were three? 

Or if there were eight cases this year but four the year before—a 
100 percent increase —I ask myself, “If I add two cases to last year's 
total and subtract two from this year's, is there a chance things haven’t
changed, except by chance?" This approach will never supplant refined 
analysis. But by playing around with the numbers this way —I some¬ 
times try three instead of two—a reporter can often spot a potential 
problem or error. 



A statistician says, “This can help with small numbers but
not large ones.” Mosteller contributes “a little trick I use a lot on
counts of any size.” He explains, “Let’s say some political unit
has 10,000 crimes or deaths or accidents this year. Has something
new happened? The minimum standard deviation [see
page 33] for a number like that is 100—that is, the square root
of the original number. That means the number may vary by a
minimum of 200 every year without even considering growth,
the business cycle, or any other effect. This will supplement
your reporter’s approach.”
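
Mosteller's square-root trick takes one line of arithmetic. The sketch below simply restates it; the function name is invented, and the band of two standard deviations either way follows his "minimum of 200" for a count of 10,000. As he says, this is a minimum: real counts usually vary more.

```python
import math

def chance_band(count: int, sds: float = 2) -> tuple[float, float]:
    """Range a yearly count could cover through chance alone,
    taking the square root of the count as its minimum standard deviation."""
    sd = math.sqrt(count)
    return count - sds * sd, count + sds * sd

print(chance_band(10_000))   # (9800.0, 10200.0)
print(chance_band(100))      # (80.0, 120.0)
```

Note how much wider the band is, in proportion, for the small count: 2 percent either way for 10,000 but 20 percent either way for 100.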

Looking for error in reported results, statisticians try to spot
both false positives and false negatives. The false positive (or Type
I or alpha error in statistical language you may see) is to find a
result or effect where there is none. The false negative (or Type II
or beta error) is to miss an effect where there is one. The latter is
particularly common when there are small numbers. “There are
some very well conducted studies with small numbers, even five
patients, in which the results are so clear-cut that you don’t have
to worry about power,” says Dr. Relman. “You still have to
worry about applicability to a larger population, but you don’t
have to doubt that there was an effect. When results are negative,
however, you have to ask, How large would the effect have
to be to be discovered?”

Many scientific and medical studies are underpowered—that
is, they include too few cases. “Whenever you see a negative
result,” another scientist says, “you should ask, What is the
power? What was the chance of finding the result if there was
one?” One study found that an astonishing 70 percent of 71
well-regarded clinical trials that reported no effect had too few
patients to show a 25 percent difference in outcome. Half of the
trials could not have detected a 50 percent difference.9

A statistician scanned an article on colon cancer in a leading
journal. “If you read the article carefully,” he said, “you will
see that if one treatment was better than the other—if it would
increase median survival by 50 percent, from five to seven and a
half years, say—they had only a 60 percent chance of finding it
out. That’s little better than tossing a coin!”

The weak power of that study would be expressed numerically
as .6, or 60 percent. Scan an article’s fine print or footnotes,
and you will sometimes find such a power statement. Most
authors still don’t report one, but the practice is growing, especially
when results are negative.

How large is a large enough sample? One statistician calculated
that a trial has to have 50 patients before there is even a 30
percent chance of finding a 50 percent difference in results.

Sometimes large populations indeed are needed.10 If some
kind of cancer usually strikes 3 people per 2,000, and you suspect
that the rate is quadrupled in people exposed to substance
X, you would have to study 4,000 people for the observed
excess rate to have a 95 percent chance of reaching statistical
significance. The likelihood that a 30-to-39-year-old woman will
suffer a myocardial infarction, or heart attack, while taking an
oral contraceptive is about 1 in 18,000 per year. To be 95 percent
sure of observing at least one such event in a one-year trial,
you would have to observe nearly 54,000 women.11
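
The 54,000 figure can be checked with a short calculation. This is a minimal sketch that assumes each woman's risk is independent and equal to the 1-in-18,000 yearly rate quoted above; the function name is invented for the example.

```python
import math

def subjects_needed(event_rate: float, confidence: float = 0.95) -> int:
    """Smallest group in which at least one event shows up with the given
    probability, if each subject independently runs the same risk."""
    # Solve 1 - (1 - rate)^n >= confidence for n
    return math.ceil(math.log(1 - confidence) / math.log1p(-event_rate))

print(subjects_needed(1 / 18_000))   # close to the 54,000 cited above
```

A handy rule of thumb follows from the same formula: for a rare event, the group needed is about three times the reciprocal of the risk.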

Even the lack of an effect—statistically sometimes called a
zero numerator—can be a trap. Say, someone reports, “We have
treated 14 leukemic boys for five years with no resulting testicular
dysfunction”—that is, zero abnormalities in 14. The question
remains, how many cases would they have had to treat to have
any real chance of seeing an effect? The probability of an effect
may be small yet highly important to know about.
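
One way to see the trap is to ask how high the true rate could be and still produce zero cases in 14. The sketch below is an illustration, not part of the original report: it assumes the boys are independent cases, and the function name is invented. The shortcut of dividing 3 by the number of subjects is a well-known approximation to the same bound.

```python
def upper_bound_zero_events(n: int, confidence: float = 0.95) -> float:
    """Largest true event rate still consistent, at the given confidence,
    with seeing no events at all in n independent subjects."""
    # Solve (1 - rate)^n = 1 - confidence for rate
    return 1 - (1 - confidence) ** (1 / n)

print(round(upper_bound_zero_events(14), 3))   # exact bound for 0 in 14
print(round(3 / 14, 3))                        # the "3 over n" shortcut
```

Under these assumptions, zero abnormalities in 14 boys cannot rule out a true rate approaching one in five.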

All this means you must often ask, What's your denominator? 
What’s the size of your population?* A disease rate of 10 percent in 
20 individuals may not mean much. A 10 percent rate in 200 
persons would be more impressive. A rate is only a figure. 
Always try to get both the numerator and the denominator. 

The most important rule of all about any numbers: Ask for 
them. When anyone makes an assertion that should include 
numbers and fails to give them, when anyone says that most 
people, or even X percent, do such and such, you should ask, 

*And know that to a statistician a population does not necessarily mean a group of
people. Statistically, a population is any group or collection of pertinent units—units with
one or more pertinent characteristics in common—people, events, objects, records, test
scores, or physiological values (like blood pressure readings). Statisticians also use the
term universe for a whole group of people or units under study.
I told this story to one statistician, who said, “I was once
called about a person who had won first, second, and third
prizes in a church lottery. I was asked to assess the probability
that this could have happened. I found out that the winner had
bought nearly all the tickets.”

He had of course asked the obvious question for both scientists
and reporters: Could the relationship described be explained by other
factors?

Not everyone will tell you, of course, for bias is a pervasive 
human failing. As one candid scientist is said to have admitted, 
“I wouldn’t have seen it if I hadn’t believed it.” Enthusiastic
investigators often tell us their findings are exciting. But they 
may be so exciting that the investigators paint the results in 
over-rosy hues. 

Other powerful human drives—the race for academic promotion
and prestige, financial connections—can also create conscious
or unconscious conflicts of interest or attitudes that feed
bias. Dr. Thomas Chalmers of Mount Sinai Medical Center in
New York tells of a drug trial financed by a pharmaceutical
firm, in which both the head of the study committee and the
main statisticians and analysts were the firm’s employees,
though not so identified in any credits. He tells of a study of oral
drugs for diabetes in which the fact that the first author had
previously published 14 articles on the subject, and in 7 had
acknowledged support by the drug manufacturers, was “not
known to the reader.”

In contrast, Chalmers describes a study also financed by a
drug firm but with a contract specifying a study protocol designed
by independent investigators and monitored by an outside
board less likely to be influenced by a desire for a favorable
outcome. “It is never possible to eliminate” potential conflicts of
interest in biomedical research, he concludes, but they should be
disclosed so others can evaluate them.13

Even a genius may be biased. Horace Freeland Judson of
Johns Hopkins University tells how Isaac Newton experimented
with prisms and lenses and developed a theory of color, light,
and the solar spectrum. He did not report seeing some dark
lines—absorption lines, which mark varying wavelengths—that
his instruments must have shown. A modern scientist argues
that Newton’s theory, not his instruments, had no place for that
evidence: “To the observing scientist, hypothesis is both friend
and enemy.”14

For years technicians making blood counts were guided by
textbooks that told them two or more “properly” studied samples
from the same blood should not vary beyond narrow “allowable”
limits. Reported counts always stayed inside those limits. A
Mayo Clinic statistician rechecked and found that at least two
thirds of the time the discrepancies exceeded the supposed
limits. The technicians had been seeing what they had been told
to expect and discounting any differences as mistakes. This also
saved them from the additional labor of doing still more counting.

Both the biased observer and the biased subject are common in 
medicine. A researcher who wants to see a treatment result may 
see one. A patient may report one out of eagerness to please the 
researcher. There is also the powerful placebo effect. Summarizing 
many studies, one scientist found that half the patients with 
headaches or seasickness—and a third of those suffering from 
coughs, mood changes, anxiety, the common cold, and even the 
disabling chest pains of angina pectoris — reported relief from a 
“nothing pill.” 15 A placebo is not truly a nothing pill; the mere
expectation of relief seems to trigger important effects within the 
body. But in a careful study the placebo should not do as well as 
a test medication; otherwise the test medication is no better than 
a placebo. 

Sampling bias is the bugaboo of both political polls and medical
studies. Say you want to know what proportion of the populace
has heart disease, so you stand on a corner and ask people
as they pass. Your sample is biased, if only because it leaves out
those too disabled to get around. Your problem, a statistician
would say, is selection. A political pollster who fails to build a valid
probability sample, easy when questioning only a thousand or
so people from coast to coast, has equally poor selection. 16

A doctor in a clinic or hospital with an unrepresentative 
patient population —healthier or sicker or richer or poorer than 
average —may report results that do not represent the popula¬ 
tion as a whole. Veterans Administration hospitals, for example, 
treat relatively few women; their conclusions may apply only to 
the disproportionate number of lower-income men who typically
seek out the VA hospitals' free care. A celebrated Mayo or
Cleveland or Ochsner clinic sees both a disproportionate number
of difficult cases and a disproportionate number of patients
affluent and well enough to travel. The famed Kinsey reports 
were valuable revelations of sexual behavior but flawed because 
the samples consisted disproportionately of upper middle-class 
men and women and of those willing to talk. 

An investigator may also introduce bias by constraining, or 
distorting, a sample —by failing to reveal nonresponse or by 
otherwise “throwing away data.” A surgeon cites his success rate 
in those discharged from the hospital after an operation but
omits those who died during or just after the procedure. Many 
people drop out of studies—sometimes they just quit—or they 
are dropped for various reasons: They could not be evaluated, 
they came down with some “irrelevant" disorders, they moved 
away, they died. In fact, many of those not counted may have 
had unfavorable outcomes had they stayed in the study. 

Mosteller tells of a nationwide study of a possibly danger¬ 
ous anesthetic. The investigators relied on autopsy results at 38 
hospitals. Unfortunately, only about 60 percent of the relevant 
dead had been autopsied, and “anything could have been explained
by the missing 40 percent, so that part of the study
wound up with a handful of nothing.”
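
To see why a missing 40 percent can explain almost anything, it may help to work out the two extreme possibilities. The sketch below is only an illustration: the 10 percent finding rate among autopsied cases is an invented figure, and the function name is hypothetical, not something taken from the study Mosteller describes.

```python
def bounds_on_true_rate(observed_rate, fraction_examined):
    """Worst-case bounds on the rate in the whole group.

    observed_rate: share of examined cases that show the finding.
    fraction_examined: share of all cases that were examined at all.
    """
    missing = 1 - fraction_examined
    # Lowest possible rate: none of the unexamined cases had the finding.
    low = observed_rate * fraction_examined
    # Highest possible rate: every unexamined case had the finding.
    high = low + missing
    return low, high


# Invented example: the finding appears in 10 percent of the 60 percent autopsied.
low, high = bounds_on_true_rate(observed_rate=0.10, fraction_examined=0.60)
print(f"True rate lies somewhere between {low:.0%} and {high:.0%}")
```

With these made-up inputs the true rate could be anywhere from about 6 to 46 percent, which is the sense in which the unexamined cases can swamp the result.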

The presence of significant nonresponse can often be de¬ 
tected, when reading medical papers, by counting the number 
of patients treated versus the number of untreated or differently 
treated controls—patients with whom the treated patients are
compared. If the number of controls is strikingly greater in a
randomized clinical trial (though not necessarily in an epidemiological
or environmental study), there were probably many
dropouts. A well-conducted study should describe and account
for them. A study that does not may report a favorable treatment
result by ignoring the fate of the dropouts—a confounding
variable. 

Age, gender, occupation, nationality, race, income, socioeconomic
status, health status, and powerful behaviors like
smoking are all possible confounding—and frequently
ignored—variables. In the 1970s, foes of adding fluoride to city
water pointed to crude cancer mortality rates in two groups of
10 U.S. cities. One group had added fluoride to water, the other
had not, and from 1950 to 1970 the cancer mortality rate rose
faster in the fluoridated cities. The National Cancer Institute
pointed out that the two groups were not equal: The difference 
in cancer deaths was almost entirely explained by differences in 
age, race, and sex. The age-, race-, and sex-adjusted difference 
actually showed a small, unexplained lower mortality rate in the 
fluoridated cities. 17 
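
How a crude rate and an adjusted rate can point in opposite directions is easier to see with numbers. The sketch below uses direct standardization, one common way to adjust for age. Every figure in it is invented for illustration (none is the National Cancer Institute's data), and the function names and the two age bands are assumptions.

```python
# Hypothetical illustration of direct age standardization.
# All populations and death counts are invented.

def crude_rate(deaths, population):
    """Deaths per 100,000 with all age bands lumped together."""
    return 100_000 * sum(deaths.values()) / sum(population.values())


def adjusted_rate(deaths, population, standard):
    """Apply each band's own death rate to a shared standard population."""
    expected = sum(
        (deaths[band] / population[band]) * standard[band] for band in standard
    )
    return 100_000 * expected / sum(standard.values())


# City group A has many more older residents than city group B.
pop_a = {"under_65": 600_000, "65_plus": 400_000}
pop_b = {"under_65": 900_000, "65_plus": 100_000}
deaths_a = {"under_65": 600, "65_plus": 3_600}
deaths_b = {"under_65": 900, "65_plus": 1_000}

# A single reference age structure used for both groups.
standard = {"under_65": 750_000, "65_plus": 250_000}

crude_a = crude_rate(deaths_a, pop_a)
crude_b = crude_rate(deaths_b, pop_b)
adj_a = adjusted_rate(deaths_a, pop_a, standard)
adj_b = adjusted_rate(deaths_b, pop_b, standard)

print(f"Crude:    A = {crude_a:.0f}, B = {crude_b:.0f} per 100,000")
print(f"Adjusted: A = {adj_a:.0f}, B = {adj_b:.0f} per 100,000")
```

In this made-up case group A looks far worse on the crude rate only because it is older; once both groups are given the same age structure, A comes out slightly lower than B.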

If you look carefully at the fate of women taking birth 
control pills, you find that advancing age and smoking are the 
two great confounders. You must take both into account to find
the greatest clusters of ill effects. Smoking has been an important
confounder in studies of industrial contaminants like asbestos,
in which, again, the smokers suffer a disproportionate number
of ill effects. 18

A 1947 survey of Chicago lawyers showed that those who 
had mere high school diplomas before entering legal training 
earned 6.3 percent more, on the average, than college gradu¬ 
ates. The confounder here —the real explanation —was age. In 
1947 there were still many older lawyers without college de¬ 
grees, and they were simply older, on the average, and hence 
more established. 19

Occupational studies often confront another seeming para¬ 
dox: The workers exposed to some possible adverse effect turn 
out to be healthier than a control group of persons without such 
exposure. The confounder: the well-known healthy-worker effect. 
may be that children are congregated in school, giving colds to
each other, thence to their families, thence to their families’
coworkers, thence to the coworkers’ families, and so on. But
cold weather — and home heating? — may still figure, perhaps by 
drying nasal passages and making them more vulnerable to 
viruses. 

The search for true variables is obviously one of the main 
pursuits of the epidemiologist, or disease detective—or of any 
physician who wants to know what has affected a patient, or of 
any student of society who seeks true causes. Like colds, many 
medical conditions, such as heart disease, cancer, and probably
mental illness, have multiple contributing factors. Where many
known, measurable factors are involved, statisticians can use
mathematical techniques—the terms you will see include multiple
regression, multivariate analysis, and discriminant analysis and factor,
cluster, path, and two-stage least-squares analyses—to relate all the
variables and try to find which are the truly important predictors.
Yet some situations, like the striking decline in U.S. heart
disease mortality in recent years, defy such analyses. These
years have seen several major changes in American life that 
may play a role: less smoking among men, consumption of a 
leaner diet, more recreational exercise (though more sedentary 
work). Medical care is far better, including the treatment of 
hypertension, which disposes people to heart disease. Many of 
these variables cannot be well measured, and the effect of some 
is debatable, so—a common situation in science —the truth re¬ 
mains uncertain. 

Variability 

Doctors always say, “Most things are better in the morning,”
and they’re mostly right. Most chronic or recurring conditions
wax and wane. We tend to wake up at night when the condition
is at its worst. Then, no matter what is done by way of treatment
the next day, the odds are that we’ll feel better.

This is regression toward the mean: the tendency of all values in
every field of science—physical, biological, social, and economic—to
move toward the average. Tall fathers tend to have
shorter sons, and short fathers, taller sons. The students who get 
the highest grades on an exam tend to get somewhat lower ones 
the next time. The regression effect is common to all repeated 
measurements. 
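
The exam example can be imitated with a small simulation. This is a sketch under a stated assumption: each score is modeled as a fixed "ability" plus independent luck on the day, and all the numbers (5,000 students, the averages, the spreads) are invented.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed model: score = stable ability + luck that changes from exam to exam.
abilities = [random.gauss(70, 8) for _ in range(5000)]
exam1 = [a + random.gauss(0, 8) for a in abilities]
exam2 = [a + random.gauss(0, 8) for a in abilities]

# Select the students in the top 10 percent on the first exam.
cutoff = sorted(exam1)[int(0.9 * len(exam1))]
top = [i for i, score in enumerate(exam1) if score >= cutoff]

mean_all = sum(exam1) / len(exam1)
top_first = sum(exam1[i] for i in top) / len(top)
top_second = sum(exam2[i] for i in top) / len(top)

print(f"Average of everyone on exam 1:     {mean_all:.1f}")
print(f"Top group's average on exam 1:     {top_first:.1f}")
print(f"Same students' average on exam 2:  {top_second:.1f}")
```

The top group still beats the overall average on the second exam, but by less than before: part of what put them on top the first time was luck, and luck does not repeat on schedule.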

Regression is part of an even more basic phenomenon:
variation, or variability. Virtually everything that is measured varies
from measurement to measurement. When repeated, every
experiment has at least slightly different results. Take a patient’s
blood pressure, pulse rate, or blood count several times in a
row, and the readings will be somewhat different. Take them at 
different times of day or on different days, and the readings may 
vary greatly. 

The important reasons? In part, fluctuating physiology, but 
also measurement errors, the limits of measurement accuracy, 
and observer variation. Examining the same patient, no two 
doctors will report exactly the same results, and the results may 
be grossly different. If six doctors examine a patient with a faint
heart murmur, only one or two may have the skill or keen
hearing to detect it. Experimental results so typically differ from
one time to the next that scientific and medical fakers—a Boston
cancer researcher, for example—have been detected by the unusual
regularity of their reported results, with numbers agreeing
too well and the same results appearing time after time, with not
enough variation from patient to patient.

Biological variation is the most important cause of variation in
physiology and medicine. Different patients, and the same patients,
react differently to the same treatment. Disease rates
differ in different parts of the country and among different popu¬ 
lations, and—alas, nothing is simple —there is natural variation 
within the same population. 

Every population, after all, is a collection of individuals, 
each with many characteristics. Each characteristic, or variable, 
such as height, has a distribution of values from person to person, 
and—if we would know something about the whole popula¬ 
tion—we must have some handy summaries of the distribution. 

We can’t get much out of a list of 10,000 measurements, so we 
need single values that summarize many measurements. 

Enter here the familiar average or, more exactly, the mean, 
median, and mode. These and a few other measures can give us
some idea of the look of the whole and its many measurable 
properties, or parameters. 

When most of us speak of an average, we mean simply the 
mean or arithmetic average, the sum of all the values divided by the 
number of values. The mean is no mean tool; it is a good way 
to get a typical number, but it has limitations, especially when
there are some extreme values. There is said to be a memorial
in a Siberian town to a fictitious Count Smerdlovski, the world’s
champion at Russian roulette. On the average he won, but his
actual record was 73 and 1. 21

If you look at the average salary in a hospital, you will not 
know that half the personnel may be working for the minimum 
wage, while a few hundred persons make $100,000 or more a 
year. You may learn more here from the median, the figure that 
divides a population into two equal halves. The median can be 
of value when a group has a few members with extreme values, 
like the 400-pounder at an obesity clinic whose other patients 
weigh from 180 to 200 pounds. If he leaves, the patients' mean 
weight might drop by 10 pounds, but the median might drop 
just 1 pound. 22 
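
The obesity-clinic example can be checked with a few lines of arithmetic. The individual weights below are invented to fit the description (20 patients between 180 and 200 pounds, plus one 400-pounder); only the general picture comes from the text.

```python
from statistics import mean, median

# Twenty hypothetical patients weighing 180 to 200 pounds.
others = [w for w in range(180, 201) if w != 190]
with_outlier = others + [400]

print("Mean with the 400-pounder:     ", mean(with_outlier))    # 200
print("Mean without him:              ", mean(others))          # 190
print("Median with the 400-pounder:   ", median(with_outlier))  # 191
print("Median without him:            ", median(others))        # 190.0
```

One extreme patient moves the mean by 10 pounds but the median by only 1, which is why the median is often the safer summary when a group has a few extreme members.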

The most frequently occurring number or value in a distribution
is called the mode. When the median and the mode are
about the same, or even more when mean, median, and mode 
are roughly equal, you can feel comfortable about knowing the 
typical value. 

You still need to know something about the exceptions, in 
short, the dispersion (or spread or scatter) of the entire distribution.
One measure of spread is the range. It tells you the lowest
and highest values. It might inform you, for example, that the 
salaries in that hospital range from $10,000 to $250,000. 

You can also divide your values into 100 percentiles, so you 
can say someone or something falls into the 10th or 71st per¬ 
centile, or into quartiles (fourths) or quintiles (fifths). One useful 
measure is the interquartile range, the interval between the 75th
and 25th percentiles—this is the distribution in the middle,
which avoids the extreme values at each end. Or you can divide
a distribution into subgroups—those with incomes from $10,000
to $20,000, for example, or ages 20 to 29, 30 to 39, and so on.
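
A short sketch shows how the range and the interquartile range differ in practice. The ten salaries are invented, chosen only so that they run from $10,000 to $250,000 as in the hospital example. Statistical packages define percentiles in slightly different ways; this one uses the default method of Python's standard `statistics.quantiles`.

```python
from statistics import quantiles

# Hypothetical hospital salaries, lowest to highest.
salaries = [10_000, 12_000, 15_000, 18_000, 22_000,
            30_000, 45_000, 60_000, 100_000, 250_000]

low, high = min(salaries), max(salaries)

# n=4 asks for the three cut points that split the data into quartiles.
q1, q2, q3 = quantiles(salaries, n=4)
iqr = q3 - q1

print(f"Range: ${low:,} to ${high:,}")
print(f"25th percentile: ${q1:,.0f}   median: ${q2:,.0f}   75th percentile: ${q3:,.0f}")
print(f"Interquartile range: ${iqr:,.0f}")
```

The range is set entirely by the two most extreme salaries, while the interquartile range describes the middle half of the staff and barely notices the $250,000 earner.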

All these values can easily be plotted. With many of the 
things that scientists, economists, or others measure—IQs, for 
example, and other test scores—we typically tend to see a familiar,
bell-shaped normal distribution, high in the middle, low at each
end, or tail. This is the classic Gaussian curve, named after the
19th-century German mathematician Karl Friedrich Gauss. 
But you may also find that the plot has two or more peaks or 
clusters, a bimodal or multimodal distribution. 

A widely used number, the standard deviation, can reveal a 
great deal. No matter how it sounds, it is not the average dis¬ 
tance from the mean but a more complex figure.* Unlike the 
range, this handy figure takes full account of every value to tell
how spread out things are—how dispersed the measurements.
In what one statistician calls a truly remarkable generalization,
in most sets of measurement “and without regard to what is
being measured,” only 1 measurement in 3 will deviate from the
average by more than 1 standard deviation, only 1 in 20 by
more than 2 standard deviations, and only 1 in 100 by more 
than 2.57 standard deviations. 

“Once you know the standard deviation in a normal, bell¬ 
shaped distribution," according to Thomas Louis, “you can draw 
the whole picture of the data. You can visualize the shape of the 
curve without even drawing the picture, since the larger the 
variation of the numbers, the larger the standard deviation and 
the more spread out the curve —and vice versa." 
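
For readers who want to see the "more complex figure" spelled out, the sketch below computes a standard deviation by hand and then checks the 1-in-3, 1-in-20, and 1-in-100 rules of thumb against an exactly normal curve. The five readings are invented, the helper names are hypothetical, and the divide-by-n (population) form of the formula is one of two common conventions.

```python
from math import sqrt
from statistics import NormalDist, pstdev


def std_dev(values):
    """Square root of the average squared distance from the mean."""
    m = sum(values) / len(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Hypothetical blood pressure readings.
readings = [118, 122, 125, 130, 135]
print(f"Standard deviation by hand:      {std_dev(readings):.2f}")
print(f"Same figure from the library:    {pstdev(readings):.2f}")

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_beyond(k):
    """Share of a normal distribution more than k standard deviations from the mean."""
    return 2 * (1 - standard_normal.cdf(k))


for k in (1, 2, 2.57):
    print(f"More than {k} SD from the mean: {share_beyond(k):.1%}")
```

For a truly normal curve the shares come out near 32 percent, 5 percent, and 1 percent, in line with the rules of thumb quoted above; real data that are not bell-shaped will follow them only roughly.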
Why think? Why not try an experiment?


—John Hunter
18th-century British anatomist


Sit down before fact as a little child, be prepared to give up every preconceived
notion, follow humbly wherever and to whatever abysses nature leads, or you
shall learn nothing.

—Thomas Henry Huxley


This is the part I always hate.


—A mathematician as he approaches the
equal sign (in a Sidney Harris cartoon in
American Scientist)


There is no disease that strikes older people more tragically
than Alzheimer’s disease, which makes a useless tangle of
the brain. At a prestigious New England university a research
team imaginatively inserted catheters into the skulls of four patients
aged 64 to 73 to deliver a continuous infusion of either a
theoretically promising drug or, alternately, an ineffectual saline
solution for comparison. 

After 18 months the investigators published a paper saying 
that according to observations by the patients’ families, three 
patients showed marked improvement and the fourth at least 
held his own. Fascinating, of course. Some reporters learned of 
the work and began inquiring. The investigators let a TV crew 
do a story and also held a news conference, with one patient 
brought forth for on-camera testimonials. Except for some
newspapers that decided to print nothing, the story flew far and 
wide. 

The head investigator, a chief resident in neurosurgery, 
cautioned that the results, though encouraging, were “very
early” and “certainly do not prove this is an effective treatment.”
He advised healthy skepticism. But headlines unequivocally
read: “Alzheimer’s Test Found Successful,” “Alzheimer’s: A New
Promise,” “First Breakthrough Against Alzheimer’s,” “Pump Offers
Hope,” “Possible Alzheimer’s Cure.”

Within two months the medical center logged 2,600 phone 
calls, mainly from desperate families, and critics began asking 
why a press conference had been held, since a study of only four 
patients—with unblinded investigators getting their assessments
from hopeful families — meant little. 

Harvard's Dr. Jay Winsten concluded that “the decision to
hold a press conference ... far outweighed in impact the modulating
effect of the investigators' qualifying language. The visual
impact of [one] patient’s on-camera testimonials all but
guaranteed that TV coverage would oversell the research, despite
any qualifying language.” 1

When dubious claims are made — about Alzheimer’s, a new 
cancer drug, a possible AIDS cure —and the claims get widely 
reported, there is commonly a lot of postmortem clucking and 
soul-searching among reporters and editors. Then someone else 
makes some sensational claim, and the same thing may happen 
all over again. 

The biggest error in medical science, according to Dr. 
Thomas Chalmers, is “the uncontrolled pilot study in which the
investigators try a treatment on 10 patients, and if it seems to
work ... are tempted to report it” to fellow scientists, let alone
the media. 2 

All science is only a stab at the truth. Even with the best of 
statistics, “We scientists don’t know how to tell the whole truth," 
Mosteller reminds us. 3 Outside this honest limitation lie vast 
realms of inadequate science with plausible-sounding yet shaky 
statistics. A French physician, Pierre Charles Alexandre Louis,
said 150 years ago, “The only reproach which can be made to
the numerical method” is that it “requires much more labor and
time than the most distinguished members of our profession”
often give it. “Some days,” says one modern statistician, “I think
every idiot in the country who can put his hands on a computer
program thinks he’s a statistician.”

The big problems of statistics, say its best practitioners, 
have little to do with computations and formulas. They have to
do with judgment, we’re told, with how to design a study, how
to conduct it, then analyze and interpret the results. In a day of 
frenzied media competition for the public’s eye and ear—and 
many chances to do harm by shaky reporting—journalism too 
calls for sophisticated judgment. How, then, can we have some 
hope of telling which studies seem credible, which we should 
report? 

A fundamental principle is that every conscientiously con¬ 
ducted study has a careful design: a method or plan of attack to 
include the right kind and number of patients or petri dishes 
and to try to eliminate bias. Different problems require different 
methods, and one of the most basic questions in science is, Can 
this kind of experiment, this design, yield the answer?

This is not a simple question for a reporter to answer, but 
there is much we can know. What kinds of studies, what kinds 
of numbers and controls and methods, should we look for? 

Experiments versus Seductive Anecdotes 

Students and eggs can be graded, citizens and cities can be 
credit-rated, and scientific evidence can be weighed according to 
what has been called a hierarchy of evidence. Some kinds of 
studies carry little weight, some more, some a great deal. 

Science and medicine started with anecdotes, unreliable as far 
as generalization is concerned, yet provocative. Anecdotes ma¬ 
tured into systematic observation, the most ancient form of 
science. Observation told the ancients much about the stars, it 
told the pharaohs’ physicians much about the sick, and it is still
implemented, or not completed, whether for lack of funds, difficulty
in recruiting or keeping patients, toxicity or other problems,
or, sometimes, rapid evidence of a difference in effect
(making continued denial of effective treatment to a control 
group unethical). Another 20 trials produce no noteworthy re¬ 
sults, and just 20, results worth publishing. Clinical trials none¬ 
theless are called the strongest, most precise, most decisive way 
to evaluate medical interventions and learn true causation. 
Randomized clinical trials proved that new drugs could cut the 
heart attack death rate, that treating hypertension could prevent 
strokes, and that polio, measles, and hepatitis vaccines worked. 
No doctor, observing a limited number of patients, could have 
shown these things. 

Types of clinical studies include the following: 

• Among the most reliable are parallel studies comparing 
similar groups given different treatments, or a treatment versus 
no treatment. But such studies are not always possible. 

• In crossover studies the same patients get two or more treat¬ 
ments in succession and act as their own controls. Similarly, self- 
controlled studies evaluate an experimental treatment by control 
observations during periods of no treatment or of some standard 
treatment. There are pitfalls here. Treatment A might affect the 
outcome of treatment B, despite the usual use of a washout period 
between study periods. Patients become acclimated: They may 
become more tolerant of pain or side effects or, now more 
health-conscious, may change their ways. The controls —the 
patients in a control group —don’t always behave in parallel 
studies either: In one large-scale trial of methods to lower blood 
cholesterol and risk of heart disease, many controls adopted 
some of the same methods—quitting cigarette smoking, eating 
fewer fats —and reduced their risk too. 

• Investigators often use historical controls (meaning compari¬ 
son with old records: historically the cure rate has been 30 
percent, say, and the new therapy cures 60 percent) or other 
external controls (such as comparison with other studies). These 
controls are often misleading—the groups compared are frequently
not comparable, the treatments may have been given
by different methods —but they are still at times useful. 

What Makes a Study Honest? 

Obviously, all studies, including the best, have potential 
pitfalls: 

• Lack of adequate controls is fatal if you really want to put the
results in the bank. 

• The group or sample studied, 10 people or 10,000, must be
large enough to get a valid result and representative enough to
apply to a larger population. Because people vary so widely in
their reactions, and a few patients can fool you, fair-sized groups
of patients are usually needed. And enough of the right kind of
subjects are needed for a suitable sample. Picking patients for a
medical study is no different from picking citizens to be questioned
in a political poll. In both, a sample is studied, and
inferences—the outcome of an election, the results in patients in 
general —are made for a larger population. 

To get a large enough sample, medical researchers more 
and more try to conduct multicenter trials, which are appealing
because they can include hundreds of patients, but expensive 
and tricky because one must try to maintain similar patient 
selection and quality control at 10 or 100 institutions. Successful 
multicenter trials established the value of controlling hypertension
to prevent strokes. They demonstrated the strong probability
that less extensive surgery is as effective as more drastic
surgery for many breast cancers. 

• The sample should be randomized— divided by some random 
method into comparable experimental and control groups. Ran¬ 
domization can easily be violated. A doctor assigning patients to 
treatment A or B may, seeing a particular type of patient, say or 
think, “This patient will be better on B.”

If treatment B has been established as better than A, there 
should be no random study in the first place and certainly no
study of that doctor's patient. When randomization is violated,
“the trial's guarantee of lack of bias goes down the drain,” says
one critique. As a result, patients who consent to randomization 
are often assigned to study groups according to a list of com¬ 
puter-generated random numbers. 
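
Assignment by computer-generated random numbers can be sketched in a few lines. This is a minimal illustration, not any particular trial's procedure: the patient numbers, the group labels, the seed, and the function name are all hypothetical, and real trials usually add refinements that are not shown here.

```python
import random


def randomize(patient_ids, seed=None):
    """Shuffle the patients and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a seed makes the list reproducible for auditing
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": sorted(shuffled[:half]), "B": sorted(shuffled[half:])}


# Twenty consenting patients, identified only by number.
groups = randomize(range(1, 21), seed=42)
print("Treatment A:", groups["A"])
print("Treatment B:", groups["B"])
```

The point of the design is that the list, not the doctor, decides who gets which treatment, so no one's hunch about a particular patient can tilt the comparison.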

• To combat bias—the influence of confounding variables—
and get answers applicable to various populations, the sample or
study population must often be stratified, or separated into
groups by age, sex, socioeconomic status, and so on. Failure to 
stratify can hide true associations. The role of high-absorbency 
tampons in toxic shock syndrome was clarified only when the 
cases were broken down by precise type of tampon used. 

The identification of important subcategories of patients 
can be tricky indeed. A study of open-heart surgery patients 
may fail to separate out those who had to wait for their surgery.
But some patients die waiting, and those left are relatively 
stronger patients who do better, on the average, than those 
treated immediately after diagnosis. 

We reporters may also fail to pay attention to stratification, 
or distribution. In early 1985 the President’s Council of Economic
Advisers reported that—to quote the page-one lead in a
major newspaper—“elderly Americans have achieved economic
parity with the rest of the population and no longer are a disadvantaged
group.” Not for several paragraphs, now on an inside
page, did the story note that “there’s a lot of variability,” and
older people are also “more likely . . . to have members with
incomes below the average of their age groups.” 5 In short, there
are still many elderly trapped in poverty. 

• To combat bias in investigators or patients, studies should be
blinded—to the extent feasible, single-, double-, or, best of all, triple-blinded,
so that neither the doctors nor the nurses administering
a treatment nor the patients nor those who assess the results
know whether today's pill is treatment A, treatment B, or an
ineffective placebo. Otherwise, a doctor or patient who yearns for
a good result may see or feel one when the “right” drug is given.
There is a tale of an overzealous receptionist who, knowing
which patients were getting the real drug and not the placebo,
was so encouraging to these patients that they began saying they
felt good, willy-nilly. 6

Barring observant receptionists, the use of a placebo —from 
the Latin meaning “I shall please”—may help maintain blindness.
Placebos actually give some relief in a third of all patients,
on the average, in various conditions. The effect is usually tem¬ 
porary, however, and a truly effective drug ought to work sub¬ 
stantially better than the placebo. 

Blinding is often impossible or unwise. Some treatments 
don’t lend themselves to it, and some drugs quickly reveal themselves
by various effects. But an unblinded test is a weaker test.

• Finally, what makes a study honest is honesty. John Bailar
warns of deliberate or careless deceptions that seem to be universally
accepted today, practices that sometimes have much
value but at other times are “inappropriate and improper and, 
to the extent that they are deceptive, unethical.” Among them: 
the selective reporting of findings, leaving out some that might 
not fit the conclusion; the reporting of a single study in multiple 
fragments, when the whole might not sound so good; and the 
failure to report the low power of some studies, their inability to 
detect a result even if one existed. 7

Dr. Charles Moertel of the Mayo Clinic says, 

Probably the majority of cancer patients treated with chemotherapy 
today are receiving regimens that have not been proved effective by 
randomized trial. . . . Many articles published in our major journals
make claims for fantastic therapeutic accomplishments with no randomized
controls. . . . Many, if not most, of the randomized studies
. . . are of such poor quality that their results are unbelievable.

Precious few have withstood the scrutiny of carefully designed 
confirmatory scientific study. 

He calls a multitude of poor methods statistical legerde¬ 
main: “the games we play, trying to squeeze out that little bit of 
breakthrough.” Why the pressure to play them? “Salvation,” Dr.
David Salsburg answers. “Fruit in this world (increases in salary,
prestige, invitations to speak) and beyond this life (continual
references in the citation index).” 8

Epidemiology: Hippocrates to AIDS 

Clinical studies deal with patients. Epidemiology deals with 
populations, which sometimes are large groups of patients. Epi¬ 
demiology seeks the causes of both health and disease by placing 
a population under its own kind of microscope, the epidemiologi¬ 
cal investigation. 

Epidemiological studies in many ways parallel clinical stud¬ 
ies—some studies are both—and are subject to many of the 
same pitfalls and rules, like avoiding bias and stratifying to get 
the right answers about the right subgroups. An old saw, in fact, 
goes, an epidemiologist is a physician broken down by age and 
sex. 

Epidemiology in its early days was concerned wholly with 
epidemics of typhoid, smallpox, and other infections. But epide¬ 
miologists today also ask, “What should we eat and how should 
we live to stay healthy?" and they study large groups to see how 
the healthiest and unhealthiest live. Hippocrates has been called 
the first environmentalist because he observed that it was 
healthier to live in high places than in low ones. Anticipating 
today's environmentalists, he blamed bad air and bad water and 
may have been partly right. But he failed to stratify; otherwise 
he might have noticed that the people who lived high were also 
wealthier and better nourished than those who lived low. 9

In 1775 Percival Pott scored a famous epidemiological
success by observing the high rate of scrotum cancer in London’s
chimney sweeps and correctly blaming it on their exposure
to soot—burned organic material, much like a smoked cigarette.
A century later, John Snow, plotting London cholera
cases on a map and noting a cluster around one source of
drinking water, removed the handle from the now famed Broad
Street pump and helped end a deadly epidemic. The 19th-
scene at the moment; it can’t portray an ever-changing picture
unless frequently repeated. Questionnaires may be no better 
than the quality of the answers, written or verbal. One survey 
compared patients’ reporting of their current chronic illnesses 
with those their doctors recorded. The patients failed to mention 
almost half of the conditions the doctors detected over the course 
of a year. And whether it comes to illness, diets, or drinking, 
people tend to put themselves in the best possible light. They 
often say both yes and no to the same question in different form. 
A survey may stand or fall on the use of sophisticated ways to 
get accurate information. 

• Epidemiologists’ studies may also be prevalence studies, case-control
studies, or cohort studies. A prevalence study, also called a current
or cross-sectional study, is a wide-angle snapshot of a population: a
look at the rate of disease X or at toxic agent X and its possible 
effects by age, sex, or other variables. A political poll is such a 
study: A cross section of the nation is examined in a period of a 
few days. 

A case-control study examines cases and controls for a close-up of 
a disease’s relationship to other factors in a small, intensively
examined group. The nation hears of cases of toxic shock syndrome,
mainly in young women. The federal Centers for Disease
Control launches a field investigation to find a series of patients,
or cases, confirm the diagnosis, then interview them and
their families and other contacts to assemble careful case histories
that cover, hopefully, all possible causes or associations. This
group is then compared with a randomly selected but matched 
compeer group, or control group, of healthy young women of like age 
and other characteristics. 

The results need to be interpreted with great caution, but 
the case-control study is often a quick, highly useful and relatively
easy, low-cost first approach or fishing expedition to assemble
clues about causes or even a working hypothesis. Or it
may test some hypothesis. A case-control study pinpointed the 
use of tampons (later found to be certain high-absorbency ones) 
as the main villain in toxic shock. The relationship of cigarette 
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smoking to lung cancer, the association of birth control pills with 
blood vessel problems, and the transmission patterns of AIDS 
were identified in case-control studies that pointed to the need 
for broader investigation. 

Cohort or incidence studies are motion pictures. They pick a 
group of people, or cohort—a cohort was a unit of a Roman 
legion —often stratify or divide them into subgroups, then follow 
them over time, often for years, to see how some disease or 
diseases develop. These studies are costly and difficult. Subjects 
drop out or disappear. Large numbers must be studied to see 
rare events. But cohort studies can be powerful instruments and 
substitutes for randomized experiments that would be ethically 
impossible. You can't ethically expose a group to an agent that 
you suspect would cause a disease. You can watch a group so 
exposed. 

The noted Framingham study of ways of life that might be 
associated with developing heart disease has followed more than 
5,000 residents of that Massachusetts town since 1948. The 
American Cancer Society's 1952-55 study of 187,783 men aged 
50 to 69, with 11,780 of them dying during that period, did
much to establish that cigarette smoking was strongly associated 
with developing lung cancer. 10 

• Many epidemiological, as well as clinical, studies are 
handicapped because they must be retrospective. They look back 
in time—at medical records, vital statistics, or people’s recollections
(for example, those collected in interviews in a case-control
study). People who have a disease are questioned to try to find 
common habits or exposures. Women with cervical cancer are 
interviewed to see how many took possibly guilty hormones and 
how many did not. People who live around a Love Canal are 
asked if they have been ill.

Retrospective studies are notoriously unreliable. Memories 
fail or play tricks. Old records are poor and misleading. Defini¬ 
tions of diseases and methods of diagnosis vary sharply over the 
years. The patients you find may not be representative. A retro¬ 
spective study, however intriguing, generally only says that 
there may be something here that ought to be investigated. 
(There are exceptions. Dr. Gary Friedman writes, “A retrospective
study can be quite reliable if based on data carefully collected
in the past. A revealing study of mortality in radiologists
was a retrospective cohort study based on good data.”)

• A prospective study, in contrast —like the Framingham and 
the American Cancer Society studies —looks forward. It focuses 
sharply on a selected group who are all followed by the same 
statistical and medical techniques. Dr. Eugene Robin at Stanford
tells how four separate retrospective clinical studies affirmed
the accuracy of a test for blood clots in the lungs. When an 
adequate prospective clinical trial was done, most of the back¬ 
ward looks were proved wrong. 11 

• Epidemiology also includes experimental studies, the classical 
experiments of science on a larger human scale. These are typi¬ 
cally intervention studies. There is some intervention or manipula¬ 
tion; something is done to some of the subjects. 

The massive and hugely successful 1954 field trial of the 
Salk polio vaccine was a classic intervention trial and a clinical 
trial too, with 401,974 first- to third-graders assigned at random 
to either a vaccinated group or a control group injected with a 
placebo, or dummy shot —and another 947,171 children 
divided between vaccinated second-graders and unvaccinated 
first- and third-graders acting as controls. In addition, in all 
participating states or counties, the investigators studied and 
counted all cases of polio in a grand total of 1,829,916 children: 
those who had taken part in the study and those who had not. 
In the placebo areas, the study was also triple-blinded: neither
the vaccinators, the subjects, nor the doctors who examined the 
subjects later for polio knew which children got which kind of 
shot. 12 

Another successful intervention study, a community trial, es¬ 
tablished the value of fluoridating water supplies to prevent 
tooth decay. Some towns had their water fluoridated; some did 
not. Blinding was impossible, but the striking difference in den¬ 
tal caries that resulted could not have been caused by any pla¬ 
cebo effect. 
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Questions Reporters 
Can Ask 



Just because Dr. Famous or Dr. Bigshot says this is what he found doesn’t mean
it is necessarily so.

— Dr. Arnold Relman


Ask to see the numbers, not just the pretty colors. 

— Dr. Richard Margolin,
National Institutes of Health,
describing PET scans to reporters


WHAT questions should we reporters ask —to make our
news solid, to report the more valid claims and ignore the weak
and phony? When a scientist or physician or anyone else says,
“I’ve discovered that . . . ,” what should we ask?

In 1949, a year after Britain’s National Health Service — 
‘‘socialized medicine”— was launched, my editors sent me to 
Britain to see how it was working. A bit stumped, I asked Dr. 
Morris Fishbein, the provocative genius who long edited the
Journal of the American Medical Association, “How can I, a reporter,
tell whether a doctor is doing a good job?” He immediately said,
“Ask him how often he has a patient take off his shirt.”

His lesson was plain: No physical examination is complete 
unless the patient takes off his or her clothes. Most reporters are 
not skilled statisticians, but we can ask some similarly revealing 
questions. Many of these are not even statistical, just simple 
ones that, like Fishbein’s, probe soft spots and often disclose 
either a conscientious approach or one that can’t be trusted. 

We can learn here from one method of science. We said 
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earlier that a properly skeptical scientist, starting a study and 
seeking truth, often begins with a null hypothesis—that treatment 
A is no better than treatment B, that there’s nothing there — then 
sees whether or not the evidence disproves it. This approach is 
much like the law’s presumption of innocence: It is for the prose¬ 
cutor to prove beyond reasonable doubt that the suspect is 
guilty. A reporter, without being cynical and believing nothing, 
should be equally skeptical and greet every claim by saying, in 
words or thought, “Show me.”

If an investigator or claimant is competent and has a good 
case, you may have to ask none or very few of these questions, 
since a good scientific presentation should answer most of them 
for you. The need for a lot of questions could itself tell you 
something. 

Here are some possible questions, then, some of them simple
and obvious ones, a few more technical for those who might
want to ask them. 

How do you know? Have you done a study? Was there an experiment?
What is the evidence? Or is the approach just anecdotal?

Answers like “In my experience . . . ,” “In my hands . . . ,”
“I’ve seen 20 cases . . . ,” and “There are four cases in our
block . . .” may be interesting, may be worth scientific investigation,
may be worth a cautious news story, but there is not yet
anything like certainty.


What kind of study was it? Was there a systematic research plan or 
design? And a protocol or set of rules? 

What was the study design or method: observational, experimental,
case-control, prospective, retrospective, or what? (See the previous chapter
for kinds of studies and their uses and limits.) “A lot of
people just scrounge around and try to come up with some 
conclusion without any real plan or design at the start," one 
medical editor reports. Was the design drawn before you started your 
study? What specific questions or hypotheses did you set out to test or 
answer? 




Why did you do it that way? Do you think it was the right kind of
study to get the answer to this question or problem?

Was it a true human experiment, if possible, with comparable groups 
picked at random for comparison? If not, why not? And what was the 
substitute? 

If an investigator patiently —you hope —tells you about an 
acceptable-sounding design, that’s worth a brownie point. If the 
answer is “Huh?” or a nasty one, that may tell you something
else. 

Are you presenting preliminary data or something fairly conclusive?
Are you presenting a conclusion or a hypothesis for further study? “Preliminary”
and “interesting” can mean “unproved.”

If the result is not reasonably conclusive, should there be further studies 
and what kind? 

How many subjects, patients, cases, or people are you talking about?
Are these numbers large enough, statistically rigorous enough, to get the 
answers you want? Was there an adequate number of patients to show a 
difference between treatments? Why are you calling a press conference to 
report on four patients? 

Small numbers can sometimes carry weight. And they may 
sometimes be the only ones possible. “Sometimes small samples 
are the best we can do,” one researcher says. But larger numbers
are always more likely to pass statistical muster.

The number studied can also depend on the subject. A 
thorough physiological study of five cases of some difficult disor¬ 
der may be important. One new case of smallpox would be a 
shocker in a world in which smallpox has supposedly been elimi¬ 
nated. In June 1981 the federal Centers for Disease Control 
reported that five young men, all active homosexuals, had been 
treated for Pneumocystis carinii pneumonia at three Los Angeles
hospitals. 1 This alerted the world to what soon became the 
AIDS epidemic. 
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define the patients, or were clinical diagnoses (necessarily less reliable) used? 

Was the assignment of subjects to treatment or other intervention 
randomized? Randomization should give every patient a 50 percent
chance of being assigned to one group or the other of a two-armed
study (one comparing two groups). Were the patients admitted
to the study before the randomization? This helps eliminate bias.
How was the randomization done? 
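
One simple answer to "How was the randomization done?" might look like the sketch below. It is only an illustration, not the procedure of any study mentioned here: the function name `randomize`, the patient numbers, and the seed are all hypothetical. Each patient who has already been admitted gets the same 50 percent chance of landing in either arm.

```python
import random

def randomize(patient_ids, seed=None):
    """Assign each enrolled patient to one of two arms with equal probability."""
    # A fixed seed (hypothetical here) makes the assignment list reproducible,
    # so that someone else can later check how it was produced.
    rng = random.Random(seed)
    return {pid: rng.choice(["treatment", "control"]) for pid in patient_ids}

# Hypothetical example: 20 patients, all admitted to the study before assignment.
assignments = randomize(range(1, 21), seed=1)
for pid, arm in assignments.items():
    print(pid, arm)
```

A plain coin flip like this does not guarantee arms of equal size, which is one reason a reporter may hear about more elaborate schemes; the point of the question is that some chance mechanism, not the investigator, decided who got what.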

If the subjects weren’t randomized, why not? One statistician says,
“If it is a nonrandomized study, a biased investigator can get
some extraordinary results by carefully picking his subjects.”

Was there a control or comparison group? If not, the study will
always be weaker. Who or what were your controls or bases for comparison?
In other words: When you say you have such and such a result,
what are you comparing it with? Are the study or patient group and the
control group similar in all respects but the treatment or other variable being
studied?

Vogt calls “comparison of non-comparable groups probably
. . . the single most common error in the medical and popular
literature on health and disease.” 2

Do you have reason to believe your subjects and controls were representative
of the general population? Or the particular population — those with
the disease or condition you are interested in? The answers here go a
long way toward answering these questions: To what populations
are the results applicable? Would the association hold for other groups?

If your groups are not comparable to the general population or some
important populations, have you taken steps to adjust for this? Either
statistical adjustment or stratification of your sample to find out about
specific groups, or both? Samples can be adjusted for age, for example,
to make an older- or younger-than-average sample more
nearly comparable to the general populace. (More on applica¬ 
bility and stratification after a bit.) 
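
To see what adjusting a sample for age can mean in practice, here is a minimal sketch of one common approach, direct standardization. Every number in it is invented for illustration, and the two age groups and the function names are assumptions, not anything taken from a real study. The sample is older than the general populace, so its crude rate overstates what the general populace would show.

```python
def crude_rate(cases, people):
    """Overall rate in the sample, ignoring age."""
    return sum(cases.values()) / sum(people.values())

def age_adjusted_rate(cases, people, standard_share):
    """Weight each age group's rate by that group's share of a standard population."""
    return sum(standard_share[g] * cases[g] / people[g] for g in cases)

# Hypothetical sample that is older than the general populace.
cases = {"under 50": 10, "50 and over": 90}
people = {"under 50": 1000, "50 and over": 3000}
# Hypothetical age make-up of the general populace.
standard_share = {"under 50": 0.7, "50 and over": 0.3}

print("crude rate:       ", crude_rate(cases, people))
print("age-adjusted rate:", age_adjusted_rate(cases, people, standard_share))
```

With these made-up figures the crude rate is 25 per 1,000, while the age-adjusted rate is 16 per 1,000, because the older group, which has the higher rate, counts for less once it is weighted as it would be in the general populace.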

Was the study blind? In a study comparing drugs or other forms of
treatment with a placebo or dummy treatment, did (1) those administering
the treatment, (2) those getting it, and (3) those assessing the outcome know
who was getting what, or were they indeed blinded, knowing only that they
were comparing A and B (or A, B, and C, perhaps)?

Could those giving or getting the treatment have easily guessed which 
was which by a difference in reaction or taste or other results? 

Not every study can be a blind study. One researcher says, 
“There can be ethical problems in not telling patients what drug 
they’re taking and the possible side effects. People are not guinea 
pigs.” True enough, but a blinded study will always carry more
conviction. 

Were there other accepted quality controls? For example, making 
sure (perhaps by counting pills or studying urine samples) that 
the patients supposed to take a pill really took it. 

Were you able to follow your protocol or study plan?

If there were questionnaires, interviews, or a survey: Were 
the questions likely to elicit accurate, reliable answers? Was it really possible
to get accurate answers to these questions? 

Sampling is as common in medical studies as in political 
polling. Every study examines a sample, not the whole population.
The sample must be reasonably accurate to give valid
results. But badly worded questions can also distort the results.
Respondents’ answers can differ sharply, depending on how
questions are asked. Example: In one study 1,153 subjects were
asked which is safer, a treatment that kills 10 of every
100 patients or a treatment with a 90 percent survival rate?
More people voted for the second way of saying precisely the 
same thing. 3 

People commonly give inaccurate answers to sensitive 
questions, such as those about sexual behavior. They are noto¬ 
riously inaccurate in reporting their own medical histories, even 
those of recent months. 

Ask: Did you pretest your questions for effectiveness before doing your 
actual survey?

Also: What was your nonresponse rate? Do you report it? 
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In any study: How many of your study subjects completed the 
course? Do you account for those who dropped out and tell why they did? 

Every study has dropouts. McMaster University’s Dr.
David Sackett says, “Patients do not disappear . . . for trivial
reasons. Rather, they leave . . . because they refuse therapy,
recover, die, or retire to the Sunbelt with their permanent disability.”
If an investigator ignores those who didn’t do well and
dropped out, it can make the outcome look better. If those who 
died of “other causes" are listed among “survivors" of the disease 
being investigated —this is sometimes done on the theory that, 
after all, they didn’t die of the target cause —it can make a 
treatment look better unless there are equal numbers of such 
deaths in every branch of the study. 

Sackett adds, “The loss to follow-up of 10 per cent of the
original inception cohort is cause for concern. If 20 per cent or
more are not accounted for, the results . . . are probably not
worth reading.” (On which Dr. Thomas Vogt comments,
“Generally true, but utterly dependent on the situation.") 
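
Sackett's two rules of thumb are easy to apply once an investigator tells you how many subjects started and how many were accounted for at the end. The sketch below simply encodes the 10 and 20 per cent figures quoted above; the function names and the enrollment numbers are hypothetical, and Vogt's caution that it all depends on the situation still applies.

```python
def fraction_lost(enrolled, accounted_for):
    """Share of the original cohort that was lost to follow-up."""
    return (enrolled - accounted_for) / enrolled

def sackett_flag(enrolled, accounted_for):
    """Apply the 10 and 20 per cent rules of thumb quoted from Sackett."""
    lost = enrolled - accounted_for
    if lost * 5 >= enrolled:    # 20 per cent or more not accounted for
        return "probably not worth reading"
    if lost * 10 >= enrolled:   # 10 per cent or more lost
        return "cause for concern"
    return "below both thresholds"

# Hypothetical study: 200 patients enrolled, 172 accounted for at the end.
print(fraction_lost(200, 172), sackett_flag(200, 172))
```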

Professor Warren Burkett of the University of Texas adds a 
few related and pointed questions: “Does the paper or publication 
contain all results of all experiments? Support for a hypothesis has 
sometimes been made to seem stronger by selective reporting
. . . including only the data that most closely fit the theory. To
what extent has the data offered . . . been smoothed from the raw
data? . . . It is not unknown for researchers to clip and round
data to make them fit [their] predicted results” (italics mine). 5

How long was the study's follow-up? How long do patients ordinar¬ 
ily survive with this disease? Were your patients followed long enough to 
really know the outcomes, good or bad?

And: How thorough was the follow-up? In one report on ame¬ 
biasis—a disease caused by an amoeba—the diagnosis was 
made by finding the amoeba in one of three consecutive stools, 
but a cure was declared after observing just one negative stool.
“It does pay to read with care," a medical professor observes. 
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Could your results have occurred just by chance? Have any statistical 
tests been applied to test this? 

Did you calculate a P value? Was it favorable — .05 or less? (Reported
as < .05; see Chapter 3.) P values and confidence statements
need not be regarded as straitjackets, but like jury verdicts,
they indicate reasonable doubt or reasonable certainty.

Remember that positive findings are more likely to be reported
and published than negative findings. Remember that a
favorable-sounding P value of < .05 means only that there is
less than 1 chance in 20, or a 5 percent probability, that the statistics
could have come out this way by pure chance when there was
actually no effect— so about 1 in every 20 tests of something that
truly has no effect will still come out statistically significant, a
misleading false positive.

There are also ways and ways of arriving at P values. For
example, an investigator may choose to report one of several end
points: death, length of survival, blood pressure, other measurements,
or just the patient’s condition on leaving the hospital. All
can be important, but a P value can be misleading if the wrong
one is picked or emphasized.

You might want to ask: Are all the important end points and their
P values reported? Also: Was the test giving the P value the appropriate
test, as planned in your written protocol, or did you finally do more than
one kind of test? (And perhaps report only the best answer?) What
were the other values?

Did you collaborate with a statistician in both your design and your 
analysis? A statistician’s collaboration often may be indicated in a
credit or footnote. 

In studies seeking cause and effect, remember that association
is not necessarily causation. Rutgers’ Dr. Michael Greenberg
reminds us, “Mathematical methods cannot establish proof of
cause and effect. They can indicate the probability that a relationship
occurred by chance, can sometimes quantify the existing
relationship between actions and effects, and can under the
best circumstances be used to predict the impact of actions even 
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if the complex phenomena driving them are not understood. 

. . . View mathematical associations with a healthy degree of
skepticism.”

A true experiment, controlling all variables, can sometimes
prove cause and effect almost surely. This is easier in physics
and chemistry than in human biology. When, then, does a close
association in an observational study (rather than a controlled 
experiment) indicate causation? There are several possible crite¬ 
ria that you can ask about: 

Is the association consistent? Are similar results usually found in 
different places and by different research methods? 

How strong is the association? If risk is an appropriate way of
describing a particular situation: What is the relative risk, or the risk
ratio? The word “strong” is used here in its mathematical sense.
It mainly means the magnitude of an effect or risk, the odds favoring
the outcome of interest versus no such outcome.

A relative risk, or risk ratio, compares two rates by dividing
one by the other. In an American Cancer Society smoking study
(see page 46) the lung cancer mortality rate in nonsmokers aged
55 to 69 was 19 per 100,000 per year; the risk in smokers was
188 per 100,000. Since 188 divided by 19 equals 9.89, the
smokers were about 9.9 times more likely to die from lung
cancer —their relative risk was 9.9. That’s strong!
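
The arithmetic is simple enough to check for yourself whenever an investigator quotes two rates. A minimal sketch, using the two figures from the smoking study above (the function name `relative_risk` is just a label chosen for this illustration):

```python
def relative_risk(rate_exposed, rate_unexposed):
    """Divide the rate in the exposed group by the rate in the unexposed group."""
    return rate_exposed / rate_unexposed

# Lung cancer deaths per 100,000 per year, from the study cited in the text.
smokers = 188
nonsmokers = 19

rr = relative_risk(smokers, nonsmokers)
print(round(rr, 2))  # 9.89
```

Both rates must be expressed on the same base (here, per 100,000 per year) before dividing, which is worth confirming when the two numbers come from different tables or different studies.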

Is there an impressive dose-response, or cause-and-effect, curve —a
curve or gradient that shows that the greater the exposure to the 
agent, or cause, the greater the effect? Heavy smokers are in¬ 
deed at greater risk than moderate smokers, and moderate 
smokers at greater risk than light smokers. (In some cases —this 
is an unsettled matter — there may be a threshold effect, an effect 
only after some minimum dose.) 

Another way of asking about risk and response: What is the 
correlation coefficient —the extent to which a set of measurements of 
the association is linear? A perfect linear relationship, or correla¬ 
tion, between two observations or variables would show up as a 
straight, steadily rising set of data points —in everyday language, 
a straight line on a graph. A perfect positive correlation, or
linear relationship, is given the value +1; +.5 would be a lesser
but still interesting relationship; −1 or any negative figure indicates
an inverse or negative relationship, such as a runner’s speed
going down as his weight goes up. A correlation of zero means 
no consistent association. 
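
For readers who want to see where such a number comes from, here is a minimal sketch of the usual (Pearson) correlation coefficient, which we assume is the one meant here. The data are invented purely to show the two extremes; the function name is hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

dose = [1, 2, 3, 4]
rising = [2, 4, 6, 8]    # made-up effect that rises in a straight line with dose
falling = [8, 6, 4, 2]   # made-up effect that falls in a straight line with dose

print(correlation(dose, rising))   # a perfect positive relationship, +1
print(correlation(dose, falling))  # a perfect inverse relationship, -1
```

Real data almost never fall exactly on a line, so real coefficients land somewhere between these extremes. The coefficient measures only how nearly straight-line the relationship is; it says nothing by itself about cause.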

How specific is the association? Does a supposed cause lead to 
many supposed effects? Or does an effect depend on many sup¬ 
posed causes? Such associations are less specific, and thus more 
suspect, until positive evidence piles up. Smoking indeed causes 
many effects. A lung disease, asbestosis, is most common when 
there is exposure to both asbestos and cigarette smoke. 

Does the supposed cause precede the effect? Is a supposed biological 
association epidemiologically plausible? One strong argument for a
cause-and-effect relationship between high consumption of satu¬ 
rated fats and cholesterol and coronary heart disease is that 
populations on such diets generally develop more such disease 
than those on leaner diets. 

Does the association make biological sense? Does it agree with 
current biological and physiological knowledge? You can’t follow 
this test out the window. Much biological fact is ill understood. 
Also, Mosteller warns, “Someone nearly always will claim to see a
[biological or physiological] association. But the people who
know the most may not be willing to.” 7

Finally, look for the real why. Ask: Are there other possible
explanations? Did you look for other explanations — confounders, or confounding
variables, that may be producing or helping produce the
association? Sometimes we read that married people live longer
than singles. Does marriage really increase life span, or may 
medical or other problems make some people less likely to 
marry and also die sooner? Maybe the Dutch thought storks 
brought babies because better-off families had more chimneys, 
more storks, and more babies. 

Did you take steps to control or adjust for other possible explanations? 
Did you do a stratified analysis—a breakdown of the data by strata
like sex, race, socioeconomic status, geographical area, occupa-
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liver than women because they drink more. They also have 
more heart disease, possibly because they've smoked longer, 
possibly because some hormones protect women. Only stratified
analyses will bring out such differences. 

Did you do an analysis (a regression or some other form of multivariate
analysis) to try to identify the important variable or variables? Such
analyses can often reveal the strongest associations. They can 
also be misused, and they are not always needed or appropriate. 
Some sophisticated questions, when appropriate: How many such 
analyses did you have to run to decide on the appropriate one? Sometimes 
the more analyses, the worse the study. How many variables did you 
consider? How many of these did you wind up reporting? If an investigator
tries enough variables in a kind of statistical fishing expedition,
he or she is almost bound to find something, true or
untrue. 
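
How nearly bound? A rough sketch, under two simplifying assumptions that are ours and not the text's: none of the variables has any real effect, and the tests are independent of one another. Then the chance of at least one "significant" result at the .05 level grows quickly with the number of variables tried.

```python
def chance_of_false_alarm(tests, alpha=0.05):
    """Probability that at least one of several independent tests of no-effect
    variables comes out 'significant' at the given level."""
    return 1 - (1 - alpha) ** tests

for k in (1, 5, 10, 20, 50):
    print(k, "variables tried:", round(chance_of_false_alarm(k), 2))
```

With 20 variables the chance is roughly 64 percent under these assumptions. Real variables are usually related to one another, so the exact figures will differ, but the direction of the effect is the reason to ask how many variables were considered and how many were reported.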

In cause-and-effect and other studies, ask: Has there been any 
reanalysis of the data? “Results, if possible, should be method- 
independent," Greenberg believes. “You should recalculate and 
see if the results hold up." 

A word of caution: Questions about multivariate analyses 
or reanalyses can be tricky. Whether or not to do one kind of 
analysis or reanalysis or none at all is often a matter of dispute 
among authorities. Launch the subject with some humility. A
reasoned answer, affirmative or negative, may tell you more 
than the answer's precise content. 


In studies of medical treatments or preventives: How did you
know or decide when your patients were cured or improved? Were there
explicit, objective outcome criteria? That is, were there firm measurements
or test results rather than physicians’ observations in interviews,
physical examinations, or chart reviews, all techniques
highly subject to great observer variation and inaccuracy? If improvement
or relief from pain—a particularly soft (hard to
quantify) outcome measure—had to be judged by observers:
Was there some systematic way of making an assessment? 

If two or more groups were compared for survival , was their starting 
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point the same at onset? At diagnosis? At start of treatment? Were they 
judged by the same disease definitions at the start and the same measures of 
severity and outcome?

Did the intervention have the good results that were intended? Has
there been an evaluation to see whether it was a useful result? 

Investigators often report that a drug or other measure has 
lowered blood cholesterol levels. Fine, but were they able to 
show that it reduced the number of heart attacks? Or was reduc¬ 
tion of a supposed risk factor itself taken to mean the hoped-for 
outcome? That may often be necessary, but the issue should be 
discussed. 

Investigators once reported that a new heart drug reduced 
the number of recurrent myocardial infarctions (heart attacks), 
fatal and nonfatal. But total mortality for all causes was higher 
in the treated group than in a placebo group. 

Public health officials may announce the success of a cam¬ 
paign to take high blood pressure measurements: X number of 
people were found to be hypertensive and were referred to their 
doctors. But how many went to their doctors? How many of 
those received optimum treatment? Were their blood pressures 
reduced? (If they were, the evidence is strong that they should 
suffer fewer strokes.) 

In short: What was the bottom line? Did you really do any good? 

To whom do your results apply? Can they be generalized to a larger
population? Are your patients like the average doctor’s patients? Is there any
basis in these findings for any patient to ask his or her doctor for a change in
treatment? Clinic populations, hospital populations, and the
“worst cases” are not necessarily typical of patients in general,
and improper generalization is unfortunately common in the
medical literature. 

Again and again, in many of the cases cited in this chapter,
ask: Do other studies back you up? Are your results consistent with other
clinical and experimental findings? Have your results been repeated or 
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confirmed or supported by other studies? Or have only you been able to get 
these results? 

Virtually no single study proves anything. Two or 4 or 15 
studies add credence, especially if the diagnostic and outcome 
criteria and the people studied are similar. Consistency of results 
in humans, animals, and laboratory tests also adds credence. 

One scientist warns, however, “You have to be wary about a
grab bag of studies with different populations and different circumstances.”
To which Harvard’s Mosteller adds, “Yes, be wary,
but consistency across such differences cheers me up.” And Dr.
John Bailar tells us that, despite possible pitfalls, “meta-analysis of
several low power reports”—that is, statistically analyzing and
integrating their results —“may come to stronger conclusions
than any one of them alone” (italics mine).
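
Bailar's point can be illustrated with a small sketch. The method shown, inverse-variance pooling of study estimates, is one standard way of combining results and is our choice for illustration; the text does not say which method he had in mind. The three "studies" and their numbers are invented, and the 1.96 cutoff corresponds to the conventional .05 level.

```python
import math

def pooled_estimate(estimates, std_errors):
    """Combine study estimates, weighting each by 1 / (standard error squared)."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Three hypothetical small studies, each finding the same modest effect
# but each too imprecise to reach significance on its own.
estimates = [0.5, 0.5, 0.5]
std_errors = [0.4, 0.4, 0.4]

for e, se in zip(estimates, std_errors):
    print("single study z =", round(e / se, 2))   # 1.25, short of 1.96

pooled, pooled_se = pooled_estimate(estimates, std_errors)
print("pooled z =", round(pooled / pooled_se, 2))
```

Taken together, the three studies give a pooled result that clears the 1.96 mark even though none does alone. The sketch assumes the studies are measuring the same thing in comparable populations, which is exactly the condition the "grab bag" warning above is about.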

Mostly just good-sense questions? Of course. Some of the
most important questions of all for a reporter to ponder are 
these: What do I think? Do the conclusions make sense to me? Do the 
data really justify the conclusions? If this person has extrapo¬ 
lated beyond the evidence, has he or she explained why and 
made sense?* 

Does the investigator frankly document or discuss the possible biases 
and flaws in the study? A good scientific paper should do so. Does 
the investigator admit that the conclusion may be tentative or equivocal? Dr. 
Robert Boruch of Northwestern University says, “It requires
audacity and some courage to say, ‘I don’t know.’” Do the authors
use qualifying phrases? If such phrases are important, we are 
bound to include them in any responsible story. 

Ask the investigators themselves: How much weight should 
your work be given? Is it really firm? And how important? An experienced
science reporter says, “I have found that good researchers
generally have an honest and proportionate view of their


*Frederick Mosteller disagrees with my occasional reference to good sense or
common sense. If something is a commonsense idea, he says, “surely all would have
thought of it. So it must be uncommon sense after all.” He makes good sense.
own work’s importance.” But there are many exceptions.

Ask others in the same field: How do other informed people
regard this report —and these investigators? Are they speaking in their own
area of expertise, or have they shown real mastery if they have ventured
outside it? Have their past results generally held up? And what are some
good questions I can ask them? True, a lot of brilliant and original
work has been pooh-poohed for a time by others. Still, scientists
survive only by eventually convincing their colleagues.

More formally: Has there been a review of the data and conclusions
by any disinterested parties? Some major clinical studies are reviewed
by independent second parties or committees. Reports
of the National Academy of Sciences must pass muster by a 
review committee. 

Has there been peer review of the material? That is, has it been 
examined by referees who were sent the article by a journal 
editor? 

And, a very important question: Has the work been published 
or accepted by a reputable journal? If not, why not? The New England 
Journal of Medicine prints only 15 percent of the papers submitted 
to it (many, of course, are rejected because they are not of 
enough interest to the journal's readers). Many have been given 
at medical or scientific meetings, yet do not pass peer reviewers’ 
or the editors’ muster. Most are eventually published elsewhere, 
many in good journals. But there are journals and journals. 

In science as a whole, including biology and often basic 
medical sciences, Science and the British Nature are indispensable.
In general medicine and clinical science at the physician’s level,
the best, most useful journals are probably New England Journal
of Medicine, Journal of the American Medical Association, Annals of
Internal Medicine, Canadian Medical Journal, Journal of Clinical Investigation,
and the British Lancet and British Medical Journal. There
are many equally good specialty journals as well as mediocre 
ones. In epidemiology, three good sources are American Journal of 
Epidemiology, Journal of Chronic Diseases, and Preventive Medicine. 
Ask people in any field: What are the most reliable journals,
those where you would want your work published? 
Some of the most valuable journals to a medical reporter
are not journals of original publication but review publications
like Family Practice and Hospital Practice, which mainly print summary
articles for practitioners. With some strong exceptions, the
free-circulation — also known as controlled-circulation —journals
and medical magazines, which depend wholly on advertising for
revenue, are not as rigorously screened as the traditional
journals. They are often on top of the news, however. All
journals print clinkers sometimes. “Scientific journals are records
of work, not of revealed truth,” says the New England
Journal’s Dr. Arnold Relman. 10

Read the entire journal article yourself, if there is one. Ask
the investigator for a copy or phone the journal. Or, assuming
the article has already been published, look for it at a medical
library, which can be found at any medical college, most good
hospitals, and the headquarters of many county medical
societies. Too many news releases tout articles that read far more
conservatively than the PR version. Many scientists go much
further in interviews or news conferences than they are willing
to go in their articles. A reporter asked a scientist, “Does peer
review of an article put you at ease?” He said, “It should help
put you at greater ease, but nothing puts me at ease until I've
read the article.”

Most reporters can’t be scientific referees, but when you read 
an article, look for the following:

• A credit or footnote indicating collaboration with a
statistician, and a paragraph describing the method of statistical
analysis and its outcomes, such as P value or confidence level, power
to detect treatment effects, and so on. If they're in place, you can
at least assume that some effort was made to apply the rigors of
statistical analysis. If they're missing, should you beware?
Sometimes. Sometimes the statistician is a coauthor whose specialty
isn't identified. And some investigators are well versed in
statistics.

• Tables and figures that tell the same story as the
conclusions. Sometimes they don’t. One statistician told reporters,
“Don’t assume that someone can interpret his own data. You
may do better.” And “muddle around in the footnotes and
appendices,” Mosteller advises. “You might find a few horrors.
That's how people found out that a much publicized study of
public and private schools included only about 12 private,
nonparochial schools.”

• Other things described in this chapter, such as the
protocol and study design, the criteria for admitting and randomizing
subjects, the therapy actually received (in contrast to that
planned in the protocol), blinding, complications, loss to
follow-up, follow-up time, and any discussion of reservations or
weaknesses.

Ask, when appropriate: Where did the money to support the study
come from? Many honest investigators are financed by companies
that may profit from the outcome. So are some dishonest or
self-deluding investigators. But the peddler of a biased point of view
is as likely to be an antiestablishment crusader, or an academic
ladder-climber, as a corporate darling. Perhaps the best
question to ask yourself is, Is this investigator a scientist or a
salesman? In any case, the public should know any pertinent
connections.

“What proportion of papers will satisfy the requirements
for scientific proof and clinical applicability?” Sackett
writes, “Not very many. . . . After all, there are only a handful
of ways to do a study properly but a thousand ways to do it
wrong.” 11

Despite impeccable design, some studies yield answers that
turn out to be wrong. Some fail for lack of understanding of
physiology and disease. Even the soundest studies may provoke 
controversy. No study settles anything for all time. 

And according to Sackett, some “may meet considerable
resistance when they discredit the only treatment currently
available. . . . Clinicians may still elect to do something, even if
it is of no demonstrable benefit. Study results may be rejected,
regardless of their merit, if they threaten the prestige or
livelihood of their audience.”

Reporters need to tread a narrow path between believing
everything and believing nothing. Also (we are reporters)
some of the controversies make important stories.

Tests and Testing



Testing is often the only way to answer our questions, but it doesn’t produce
unassailable, universal truths that should be carved on stone tablets. Instead, 
testing produces statistics, which must be interpreted. 

-Robert Hooke 


Who knows when thou mayest be tested?


-Ronald Arthur Hopwood

Do physicians always know what they're doing when they
administer tests? Stanford's Dr. Eugene Robin says many tests
“have not been properly evaluated and in fact may be useless or
harmful.” He asks, “Is it common practice in medicine to
perform careful clinical trials before introducing tests that can affect
the welfare of masses of patients? Sadly, the answer is no.” 1

A good test should detect both health and disease and do so
with high accuracy. The measures of the value of a clinical test,
one used for medical diagnosis, are sensitivity and specificity, or,
simply, the ability to avoid false negatives and false positives.
Sensitivity is how well a test identifies a disease or condition in those who
have it: how well it avoids false negatives, or missed cases. If 100
people with a condition are tested and 90 test positive, the test’s
sensitivity is 90 percent. Specificity is how well a test identifies
those who do not have the disease or condition: how well it rules
out false positives, or mistaken identifications. If 100 healthy
people are tested and 90 test negative, the test’s specificity is 90
percent.
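
For readers who like to check the arithmetic, here is a minimal sketch in Python of the two measures. The function names are ours, chosen for illustration, and the counts assume 100 people with the condition (90 test positive, 10 are missed) and 100 healthy people (90 test negative, 10 are wrongly flagged).

```python
def sensitivity(true_positives, false_negatives):
    """Share of people WITH the condition whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of people WITHOUT the condition whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Assumed counts: 100 people with the condition, 90 test positive, 10 missed.
sens = sensitivity(true_positives=90, false_negatives=10)

# Assumed counts: 100 healthy people, 90 test negative, 10 falsely positive.
spec = specificity(true_negatives=90, false_positives=10)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
# prints: sensitivity = 90%, specificity = 90%
```

Note that each measure is computed within its own group: sensitivity looks only at people who have the condition, specificity only at those who do not.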

Sensitivity, in short, tells us about disease present. Specificity tells 
us about disease absent. A highly unspecific test will produce 
many false positives; a highly insensitive test, many false nega- 